UnicodeDecodeError Only With Cx_freeze
Solution 1:
Tell us exactly which version of Python on what platform.
Show the full traceback that you get when the error happens. Look at it yourself. What is the last line of your code that appears? What do you think is the bytes string that is being decoded? Why is the ascii codec being used?
Note that automatic conversion of bytes to str with a default codec (e.g. ascii) is NOT done by Python 3.x. So either you are doing it explicitly or cx_freeze is.
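To illustrate the point, here is a minimal sketch (the sample byte string is made up, not taken from the question). Mixing bytes and str in Python 3 raises a TypeError instead of decoding silently, so a UnicodeDecodeError that names the ascii codec means some code asked for that decode explicitly.

```python
# Hypothetical sample data containing a 0xa0 byte.
raw = b"caf\xa0"

# Python 3 refuses to mix bytes and str; nothing is decoded behind your back.
try:
    combined = "prefix: " + raw
except TypeError as exc:
    implicit_error = exc

# A UnicodeDecodeError only shows up when a decode is requested explicitly.
try:
    raw.decode("ascii")
except UnicodeDecodeError as exc:
    explicit_error = exc

print(type(implicit_error).__name__)
print(explicit_error.encoding, hex(raw[explicit_error.start]))
```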
Update after further info in comments.
Excel does not save csv files in ASCII. It saves them in what MS calls "the ANSI codepage", which varies by locale. If you don't know what yours is, it is probably cp1252. To check, do this:
>>> import locale; print(locale.getpreferredencoding())
cp1252
If Excel did save files in ASCII, your offending '\xa0' byte would have been replaced by '?' and you would not be getting a UnicodeDecodeError.
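A small sketch of the difference, using only Python's own codecs (it says nothing about how Excel is implemented): the NO-BREAK SPACE is the single byte 0xa0 in cp1252, an ASCII encoder with errors="replace" turns it into '?', and decoding the cp1252 byte as ASCII is what fails.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE

# In cp1252 the character is the single byte 0xa0.
as_cp1252 = NBSP.encode("cp1252")

# ASCII cannot represent it; with errors="replace" the encoder substitutes "?".
as_ascii = NBSP.encode("ascii", errors="replace")

# Decoding the cp1252 byte with the ascii codec raises UnicodeDecodeError.
try:
    as_cp1252.decode("ascii")
    ascii_decode_failed = False
except UnicodeDecodeError:
    ascii_decode_failed = True

print(as_cp1252, as_ascii, ascii_decode_failed)
```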
Saving your files in UTF-8 would need you to open your files with encoding='utf8', and as long as something is still decoding them as ascii you would have the same problem (except that you'd get a grumble about 0xc2 instead of 0xa0).
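Why 0xc2? A quick sketch: in UTF-8 the NO-BREAK SPACE becomes two bytes, and the ascii codec chokes on the first of them.

```python
NBSP = "\u00a0"  # NO-BREAK SPACE

# UTF-8 encodes U+00A0 as two bytes: 0xc2 0xa0.
utf8_bytes = NBSP.encode("utf-8")

# The ascii codec fails on the first byte, which is now 0xc2.
try:
    utf8_bytes.decode("ascii")
except UnicodeDecodeError as exc:
    first_bad_byte = utf8_bytes[exc.start]

# Decoding with the matching codec works fine.
round_trip = utf8_bytes.decode("utf-8")

print(utf8_bytes, hex(first_bad_byte), round_trip == NBSP)
```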
You don't need to post all four of your csv files on the web. Just run this little script (untested):
import sys
for filename in sys.argv[1:]:
    # Opened with the platform's default text encoding, as in the original script.
    with open(filename) as f:
        for lino, line in enumerate(f, 1):
            if '\xa0' in line:
                print(ascii(filename), lino, ascii(line))
The '\xa0' is a NO-BREAK SPACE, aka &nbsp; ... you may want to edit your files to change these to ordinary spaces.
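If editing by hand is tedious, something along these lines could do the replacement. It is only a sketch: replace_nbsp is a made-up helper name, and cp1252 is assumed as the file encoding, so adjust it to whatever your locale check reports. Work on a copy of your file first.

```python
def replace_nbsp(path, encoding="cp1252"):
    """Rewrite the file at `path`, turning NO-BREAK SPACEs into plain spaces.

    Returns the number of characters replaced. The default encoding is an
    assumption; pass the one your files really use.
    """
    # newline="" keeps the original line endings untouched.
    with open(path, encoding=encoding, newline="") as f:
        text = f.read()
    count = text.count("\xa0")
    if count:
        with open(path, "w", encoding=encoding, newline="") as f:
            f.write(text.replace("\xa0", " "))
    return count
```

Calling replace_nbsp("copy_of_your_file.csv") (a placeholder name) returns how many characters were changed; a result of 0 means the file was left alone.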
Probably you will need to ask on the cx_freeze mailing list to get an answer to why this error is happening. They will want to know the full traceback. Get some practice -- show it here.
By the way, "offset 7338" is rather large -- do you expect lines that long in your csv file? Perhaps something is reading all of your file ...
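To see what actually sits at a given offset, you can scan the file in binary mode, which involves no codec at all and so cannot raise a decode error. This is a rough sketch with a made-up helper name; it assumes the offset in the error message is a byte offset into the whole file, which is only true if the file is being decoded in one piece.

```python
def find_high_bytes(path):
    """Yield (offset, line_number, byte_value) for every non-ASCII byte in the file."""
    # Binary mode: the bytes are read as-is, nothing is decoded.
    with open(path, "rb") as f:
        data = f.read()
    line_number = 1
    for offset, value in enumerate(data):
        if value >= 0x80:
            yield offset, line_number, value
        elif value == 0x0A:  # newline byte, so the next byte is on a new line
            line_number += 1
```

Looping over find_high_bytes(your_file) and printing each tuple shows whether a 0xa0 byte really lives at the reported offset and which line it is on.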
Solution 2:
That error itself indicates that you have a byte in a Python byte string that isn't a normal ASCII character:
>>> b'abc\xa0'.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 3: ordinal not in range(128)
I certainly don't know why this would only happen when a script is frozen. You could wrap the whole script in a try/except and manually print out all or part of the string in question.
EDIT: here's how that might look:
try:
    pass  # ... your script here
except UnicodeDecodeError as e:
    # Clamp the lower bound so a negative index doesn't wrap around to the end.
    print("Exception happened in string '...%s...'" % (e.object[max(e.start - 50, 0):e.start + 51],))
    raise
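The exception object carries more than the offending string. Here is a sketch of a helper (describe_decode_error is a made-up name) that pulls out the standard UnicodeDecodeError attributes: encoding, start, end, reason and object.

```python
def describe_decode_error(exc, context=20):
    """Return a short report built from a UnicodeDecodeError's attributes."""
    lo = max(exc.start - context, 0)  # clamp so the slice never goes negative
    snippet = exc.object[lo:exc.end + context]
    return "codec=%s start=%d end=%d reason=%s near=%r" % (
        exc.encoding, exc.start, exc.end, exc.reason, snippet)


try:
    b"abc\xa0def".decode("ascii")
except UnicodeDecodeError as e:
    report = describe_decode_error(e)
    print(report)
```

The codec name in the report tells you which encoding was actually requested, which is the interesting part when the frozen and unfrozen runs behave differently.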
Solution 3:
Fix it by setting the default encoding (Python 2 only; sys.setdefaultencoding() does not exist in Python 3):
import sys
reload(sys)
sys.setdefaultencoding("utf-8")
Solution 4:
Use the decode() method on those lines (str.decode() in Python 2, bytes.decode() in Python 3). You can also specify the encoding, like myString.decode('cp1252').
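In Python 3 the same idea is usually expressed by passing the encoding when opening the file, so nothing depends on the platform default. A minimal sketch, assuming the files are cp1252 as suggested in Solution 1 (read_csv_rows is a made-up helper name):

```python
import csv


def read_csv_rows(path, encoding="cp1252"):
    """Read a CSV file with an explicit encoding instead of the platform default."""
    # newline="" is what the csv module documentation recommends for file objects.
    with open(path, encoding=encoding, newline="") as f:
        return list(csv.reader(f))


# The bytes-level equivalent of myString.decode('cp1252'):
decoded = b"abc\xa0".decode("cp1252")
print(ascii(decoded))
```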
See also: http://docs.python.org/release/3.0.1/howto/unicode.html#unicode-howto