UnicodeDecodeError Only With Cx_freeze

September 13, 2022

I get the error: 'UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 7338: ordinal not in range(128)' once I try to run the program after I freeze my script with

Solution 1:

Tell us exactly which version of Python on what platform.

Show the full traceback that you get when the error happens. Look at it yourself. What is the last line of your code that appears? Which bytes string do you think is being decoded? Why is the ascii codec being used?

Note that automatic conversion of bytes to str with a default codec (e.g. ascii) is NOT done by Python 3.x. So either you are doing it explicitly or cx_freeze is.
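One way to narrow down where ascii comes from is to print the encodings the interpreter reports, once from the plain script and once from the frozen executable, and compare the two. This is a minimal sketch using only standard-library calls; the function name encoding_report and the dictionary keys are made up for illustration.

```python
import locale
import sys

def encoding_report():
    """Collect the encodings this interpreter reports (keys are illustrative)."""
    return {
        "default": sys.getdefaultencoding(),
        # open() without encoding= normally falls back to this value.
        "preferred": locale.getpreferredencoding(False),
        "filesystem": sys.getfilesystemencoding(),
        # sys.stdout can be None in some frozen GUI programs, hence getattr.
        "stdout": getattr(sys.stdout, "encoding", None),
    }

for name, value in encoding_report().items():
    print(name, "=", value)
```

If the frozen run prints a different "preferred" value than the plain run, a bare open(filename) call is a likely place for the wrong codec to be picked up.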

Update after further info in comments.

Excel does not save csv files in ASCII. It saves them in what MS calls "the ANSI codepage", which varies by locale. If you don't know what yours is, it is probably cp1252. To check, do this:

>>> import locale; print(locale.getpreferredencoding())
cp1252

If Excel did save files in ASCII, your offending '\xa0' byte would have been replaced by '?' and you would not be getting a UnicodeDecodeError.

Saving your files as UTF-8 instead would not help by itself: you would then have to open them with encoding='utf8', and without that you would hit the same problem (except that the grumble would be about byte 0xc2 instead of 0xa0).
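The difference between the two complaints can be checked in a few lines. This is only a sketch; the names nbsp_cp1252, nbsp_utf8 and first_bad_byte are illustrative and not part of any library.

```python
# A no-break space is the single byte 0xA0 in cp1252.
nbsp_cp1252 = "\u00a0".encode("cp1252")   # b'\xa0'
# The same character takes two bytes in UTF-8, the first being 0xC2.
nbsp_utf8 = "\u00a0".encode("utf-8")      # b'\xc2\xa0'

def first_bad_byte(data):
    """Return the first byte the ascii codec rejects, or None if all decode."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as exc:
        return data[exc.start]
    return None

print(hex(first_bad_byte(b"abc" + nbsp_cp1252)))  # 0xa0
print(hex(first_bad_byte(b"abc" + nbsp_utf8)))    # 0xc2
```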

You don't need to post all four of your csv files on the web. Just run this little script (untested):

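import sys

# Report every line that contains a NO-BREAK SPACE (U+00A0).
for filename in sys.argv[1:]:
    # latin-1 decodes any byte and maps 0xa0 to '\xa0', so open() cannot fail here.
    with open(filename, encoding='latin-1') as f:
        for lino, line in enumerate(f, 1):
            if '\xa0' in line:
                print(ascii(filename), lino, ascii(line))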

The '\xa0' is a NO-BREAK SPACE, aka &nbsp; in HTML. You may want to edit your files to change these to ordinary spaces.
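If editing the files by hand is impractical, the replacement can be scripted. A minimal sketch, assuming the files really are cp1252; the helper names replace_nbsp and clean_file are hypothetical.

```python
def replace_nbsp(text):
    """Return text with every NO-BREAK SPACE (U+00A0) turned into a plain space."""
    return text.replace("\u00a0", " ")

def clean_file(src, dst, encoding="cp1252"):
    """Copy src to dst, replacing no-break spaces. The encoding is an assumption."""
    # newline="" keeps the original line endings untouched on read and write.
    with open(src, encoding=encoding, newline="") as fin:
        data = fin.read()
    with open(dst, "w", encoding=encoding, newline="") as fout:
        fout.write(replace_nbsp(data))
```

Writing to a separate destination file keeps the original around in case the encoding guess turns out to be wrong.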

Probably you will need to ask on the cx_freeze mailing list to get an answer to why this error is happening. They will want to know the full traceback. Get some practice -- show it here.

By the way, "offset 7338" is rather large -- do you expect lines that long in your csv file? Perhaps something is reading all of your file ...
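To see what sits at that offset, you can map it back to a line and column. This sketch assumes the position in the error message is an offset into the raw bytes of the whole file, which may not be true if something else was being decoded; locate_offset and report are hypothetical helper names.

```python
def locate_offset(data, offset):
    """Map a byte offset in data to (line number, column), both 1-based."""
    line_no = data.count(b"\n", 0, offset) + 1
    line_start = data.rfind(b"\n", 0, offset) + 1  # 0 when no newline precedes
    return line_no, offset - line_start + 1

def report(filename, offset=7338):
    """Print where offset falls in filename, plus the bytes around it."""
    with open(filename, "rb") as f:
        data = f.read()
    context = data[max(0, offset - 20):offset + 20]
    print(filename, locate_offset(data, offset), context)
```

If the byte at that position in one of the csv files is not 0xa0, the text being decoded is probably coming from somewhere else.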

Solution 2:

That error indicates that the bytes being decoded contain a byte outside the ASCII range (0x00 to 0x7f):

>>> b'abc\xa0'.decode('ascii')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
UnicodeDecodeError: 'ascii' codec can't decode byte 0xa0 in position 3: ordinal not in range(128)

I certainly don't know why this would only happen when a script is frozen. You could wrap the whole script in a try/except and manually print out all or part of the string in question.

EDIT: here's how that might look

try:
    pass  # ... your script here
except UnicodeDecodeError as e:
    # e.object is the bytes being decoded; show up to 50 bytes either side.
    print("Exception happened in string '...%r...'" % (e.object[max(0, e.start-50):e.start+51],))
    raise

Solution 3:

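On Python 2 only, you can fix this by setting the default encoding. This does not apply to Python 3, where reload is not a builtin and sys.setdefaultencoding does not exist:

import sys
reload(sys)
sys.setdefaultencoding("utf-8")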

Solution 4:

Call decode() on the offending byte strings, and specify the encoding explicitly, e.g. myString.decode('cp1252'). In Python 3 only bytes objects have decode(); str does not.
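When the bytes come from a file, the same idea applies at open() time: pass the encoding instead of relying on the default. A minimal sketch for the csv case, assuming cp1252; read_rows is a hypothetical helper name.

```python
import csv

def read_rows(filename, encoding="cp1252"):
    """Read a csv file with an explicit encoding instead of the platform default."""
    # newline="" is what the csv module recommends for files passed to csv.reader.
    with open(filename, encoding=encoding, newline="") as f:
        return list(csv.reader(f))
```

With the encoding fixed in the call, the result no longer depends on whichever default the frozen or unfrozen interpreter picks up.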

See also: http://docs.python.org/release/3.0.1/howto/unicode.html#unicode-howto
