How To Write Csv In Python Efficiently?
Solution 1:
You can easily write directly a gzip file by using the gzip module :
import gzip
import csv
f=gzip.open("myfile.csv.gz", "w")
csv_w=csv.writer(f)
for row in to_write :
csv_w.writerow(row)
f.close()
Don't forget to close the file, otherwise the resulting csv.gz file might be unreadable.
You can also do it in a more pythonic style :
with gzip.open("myfile.csv.gz", "w") as f :
csv_w = csv.writer(f)
...
which guarantees that the file will be closed.
Hope this helps.
Solution 2:
CSV is CSV and there is not much you can do about it. You can simply gzip it, if you really want to stick with CSV, or you can use some custom format that better fits your needs.
For example you can use a dictionary and export to JSON format, or create a dedicated object that handles your data and pickle it.
When I worked with TF-IDF, I used sqlite (via sqlalchemy) to store documents information, with TF data as dictionary in JSON format. From that I created IDF stats, and later did rest of TFIDF, using numpy
Post a Comment for "How To Write Csv In Python Efficiently?"