
Failed To Read Inch Symbol In Pandas Read_csv

I have a CSV with the details below:

Name,Desc,Year,Location
Jhon,12" Main Third ,2012,GR
Lew,"291" Line (12,596,3)",2012,GR
,All, 1992,FR
...

It is a very long file; I just showed prob

Solution 1:

You can parse the lines yourself with a regular expression instead of relying on read_csv. See if this works for you:

import re

import pandas as pd

rows = []
with open('/home/yusuf/Desktop/c1') as f:
    # The first line holds the column names.
    headers = f.readline().strip('\n').split(',')
    for line in f:
        # Name, greedy Desc (may contain commas and quotes), Year, Location.
        m = re.match(r"^(\w*),(.*),\s?(\d+),(\w+)", line)
        if m:
            rows.append(m.groups())

df = pd.DataFrame(data=rows, columns=headers)
df

Output: a DataFrame with the columns Name, Desc, Year and Location (the original screenshot is not available).

Regex Demo: https://regex101.com/r/AU2WcO/1

Solution 2:

You can't have the separator character inside a field unless the field is properly quoted. For example, in

Lew,"291" Line (12,596,3)",2012,GR

pandas will see 6 fields because the line has 5 commas, even though two of them sit between quotes: the inch marks are not escaped, so the quoted section ends early and the later commas still split the field. You would need to pre-process the text file to get rid of this issue, or ask for a different separator character (@ or | seem to work well in my experience).

Pandas has no problems reading the other lines:

import pandas as pd
print(pd.read_csv('untitled.txt'))

   Name             Desc  Year Location
0  Jhon  12" Main Third   2012       GR
1   NaN              All  1992       FR
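
The pre-processing mentioned in Solution 2 could look roughly like the sketch below. It is not the answerer's code: it assumes that only the second field (Desc) can contain commas or quote characters, so the first comma and the last two commas of each line are the real separators. The helper name split_row and the inline sample list are made up for illustration.

```python
def split_row(line):
    """Split one line into [Name, Desc, Year, Location].

    Assumption: only Desc may contain commas or quotes, so we cut once
    from the left (Name) and twice from the right (Year, Location).
    """
    name, rest = line.rstrip("\n").split(",", 1)
    desc, year, location = rest.rsplit(",", 2)
    return [name, desc, year.strip(), location.strip()]


# Sample rows taken from the question; a real script would read the file.
sample = [
    'Name,Desc,Year,Location',
    'Jhon,12" Main Third ,2012,GR',
    'Lew,"291" Line (12,596,3)",2012,GR',
    ',All, 1992,FR',
]

header = sample[0].split(",")
rows = [split_row(line) for line in sample[1:]]

for row in rows:
    print(row)
```

The resulting rows and header can then be handed to pd.DataFrame(rows, columns=header), the same way Solution 1 builds its frame. If more than one column can contain commas, this approach no longer works and the file needs a different separator or proper quoting at the source.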
