Using Python 3.2, I am trying to decode bytes using str(bytes, "cp1251"), but I get this error:

Traceback (most recent call last):
  File "C:\---\---\---\---.py", line 4, in <module>
    writetemp.write(str(f.read(), "cp1251"))
  File "C:\Python32\lib\encodings\cp1252.py", line 19, in encode
    return codecs.charmap_encode(input,self.errors,encoding_table)[0]
UnicodeEncodeError: 'charmap' codec can't encode characters in position 19-25: character maps to <undefined>

As you can see, I specified "cp1251", but it attempts to use "cp1252.py" to decode instead of "cp1251.py", which I think is what causes the error. The same thing occurs if I try "Windows-1251" instead of "cp1251".

1 Answer

Note how what you're getting is a UnicodeEncodeError, not a UnicodeDecodeError. The error doesn't come from your str(f.read(), "cp1251") call. Instead, it comes from the writetemp.write() call.
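To see the two steps separately, here is a minimal sketch. The sample bytes are made up for illustration and stand in for whatever f.read() returns; cp1252 is used as the target only because that is the codec named in the traceback.

```python
# Hypothetical sample data: Cyrillic text stored as cp1251 bytes,
# standing in for the result of f.read().
raw = "Привет".encode("cp1251")

# Step 1: decoding bytes -> str with cp1251. This succeeds.
text = str(raw, "cp1251")
decoded_ok = isinstance(text, str)

# Step 2: encoding str -> bytes with cp1252, which is what write()
# does implicitly when the file was opened with that encoding.
failed_with = None
try:
    text.encode("cp1252")
except UnicodeEncodeError as exc:
    failed_with = type(exc).__name__

print(decoded_ok, failed_with)
```

Only the second step raises, which matches the UnicodeEncodeError in the traceback.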

The str() call decodes the bytes you get from f.read() using cp1251 as the encoding. That works, and it gives you a string (which is Unicode in Python 3). writetemp.write() then has to turn the string back into bytes by encoding it. It does that using the encoding you passed when opening writetemp or, if you passed none, the default I/O encoding, which Python guesses from your locale settings.

You can see which encoding that is by looking at the encoding attribute of the file object. Judging by the cp1252.py frame in your traceback, you'll find it is cp1252. If you want to write in a particular encoding, don't rely on Python guessing it; explicitly specify the encoding when you open the file.
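A minimal sketch of specifying the encoding explicitly might look like this. The file names, the temporary directory, and the sample text are assumptions made up for the example, and utf-8 is just one reasonable choice of output encoding; use whatever encoding the consumer of the file expects.

```python
import os
import tempfile

# Hypothetical paths in a scratch directory, for illustration only.
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "source.txt")
dst_path = os.path.join(workdir, "copy.txt")

# Create a cp1251-encoded input file to stand in for the real one.
with open(src_path, "wb") as f:
    f.write("Привет".encode("cp1251"))

# Read raw bytes and decode them, as in the question.
with open(src_path, "rb") as f:
    text = str(f.read(), "cp1251")

# Open the output file with an explicit encoding instead of relying
# on the locale-dependent default.
with open(dst_path, "w", encoding="utf-8") as writetemp:
    used_encoding = writetemp.encoding  # the encoding write() will use
    writetemp.write(text)

# Read the file back with the same encoding to confirm the round trip.
with open(dst_path, "r", encoding="utf-8") as check:
    round_trip = check.read()
```

Checking writetemp.encoding on the file object opened without an encoding argument is also a quick way to confirm what the default is on a given machine.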
