Failed To Read Inch Symbol In Pandas Read_csv
I have a CSV file with the details below:
Name,Desc,Year,Location
Jhon,12" Main Third ,2012,GR
Lew,"291" Line (12,596,3)",2012,GR
,All, 1992,FR
...
It is a very long file. I just showed prob
Solution 1:
You can do something like this. See if it works for you:
import re

import pandas as pd

rows = []
with open('/home/yusuf/Desktop/c1') as f:
    # The first line holds the column names.
    headers = f.readline().strip('\n').split(',')
    for line in f:
        # Name, then a greedy Desc (which may contain commas), then Year and Location.
        match = re.findall(r"^(\w*),(.*),\s?(\d+),(\w+)", line)
        if match:
            rows.append(list(match[0]))

df = pd.DataFrame(data=rows, columns=headers)
df
Regex Demo: https://regex101.com/r/AU2WcO/1
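To see what the pattern captures without touching a file, here is a small sketch that applies the same regex to the sample rows from the question. The `sample_lines` list is made up for illustration, and the quote characters are assumed to be inch marks (`"`):

```python
import re

# Same pattern as in Solution 1: Name, greedy Desc, Year, Location.
pattern = re.compile(r"^(\w*),(.*),\s?(\d+),(\w+)")

# Hypothetical in-memory copy of the question's data rows.
sample_lines = [
    'Jhon,12" Main Third ,2012,GR',
    'Lew,"291" Line (12,596,3)",2012,GR',
    ',All, 1992,FR',
]

# Keep only the lines that match, one list of four strings per row.
parsed = [list(m.groups()) for m in map(pattern.match, sample_lines) if m]
for row in parsed:
    print(row)
```

Because `(.*)` is greedy, the commas inside `(12,596,3)` stay in the Desc field, and only the last two commas are used to split off Year and Location. Every captured value is a string, so Year would still need converting (for example with `pd.to_numeric`) if numbers are wanted.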
Solution 2:
A separator character inside a field only works if the field is quoted correctly, and here the inch marks get in the way. For example, in
Lew,"291" Line (12,596,3)",2012,GR
Pandas will assume you have 6 fields because you have 5 commas, even though two of them sit between quote characters. You would need to pre-process the text file to get rid of this issue, or ask for a different separator character (@ or | seem to work well in my experience).
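As a rough sketch of the different-separator suggestion, assume the file could be re-exported with `|` as the delimiter. The `sample` text below is a made-up pipe-delimited version of the question's rows, not the real file. Passing `quoting=csv.QUOTE_NONE` is an extra assumption on top of the answer: it tells pandas to treat `"` as an ordinary character, so the inch marks are kept as they are.

```python
import csv
import io

import pandas as pd

# Hypothetical pipe-delimited version of the question's rows.
sample = (
    'Name|Desc|Year|Location\n'
    'Jhon|12" Main Third|2012|GR\n'
    'Lew|"291" Line (12,596,3)"|2012|GR\n'
    '|All|1992|FR\n'
)

# sep='|' leaves the commas inside Desc alone;
# QUOTE_NONE keeps the inch marks as literal characters.
df = pd.read_csv(io.StringIO(sample), sep='|', quoting=csv.QUOTE_NONE)
print(df)
```

For a real file, the `io.StringIO(sample)` argument would be replaced by the file path.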
Pandas has no problems reading the other lines:
import pandas as pd
print(pd.read_csv('untitled.txt'))
   Name            Desc  Year  Location
0  Jhon  12" Main Third  2012        GR
1   NaN             All  1992        FR