6
_f = open("c:/go-next.png", "rb")
data = _f.read()
_f.close()
data.encode("utf-8")

# Error: UnicodeDecodeError: file <maya console> line 1: ascii # 

As you can see, I open an image file and read its contents, which come back as a byte string. I have to convert the data to UTF-8, but the conversion fails. Maybe the binary data contains some extra characters (or not) that conflict with the conversion. Is there any way to solve this?

4
  • 1
    I don't mean to be blunt, but converting a PNG to UTF8 doesn't make any sense. PNG is an image format. UTF8 is a text encoding. Can you explain more what it is you are trying to do? Commented Feb 7, 2013 at 4:10
  • I am trying to POST my data to a server over HTTP. To do that, I followed http://www.doughellmann.com/PyMOTW/urllib2/#uploading-files and got it working. But to POST special characters (like Korean), I have to send them in "UTF-8" format. The server administrator told me to do that :) Commented Feb 7, 2013 at 9:18
  • 1
    @Hyun-geunKim Using the code in that link, you would call add_file for the image and add_field for text: .add_file("my_image", "go-next.png", open("c:/go-next.png", "rb"), "image/png") for the image, and .add_field("key", "text") for text. Commented Feb 7, 2013 at 10:18
  • Please see timeartist's answer. To convert binary data to UTF-8 (which is an encoding for text), you need an intermediate format. For example, using base64: file_data_b64 = b64encode(file_data).decode('utf-8'). You can then get back to the binary format when you save the file, avoiding data loss: a_file.write(b64decode(file_data_b64)). Decoding with another text encoding, like latin-1, before encoding to UTF-8 is dangerous because not every binary file (for example, a PNG) is suitable for text encoding. data.decode('latin-1').encode("utf-8") is probably NOT what you want. Commented Sep 24, 2023 at 15:34

5 Answers

7

You can always map a str to unicode using the latin-1 codec. Once you have a unicode, you can always encode it in utf-8:

data.decode('latin-1').encode("utf-8")
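
A minimal sketch of what this round trip does, written for Python 3 (where the byte string is `bytes` rather than Python 2's `str`). The sample bytes are only the 8-byte PNG signature, used here as an assumed stand-in for the real file contents. It suggests that the conversion is reversible, but that the UTF-8 output is no longer the original image bytes:

```python
# Stand-in for the file contents: the 8-byte PNG signature (assumption for illustration).
data = b"\x89PNG\r\n\x1a\n"

# latin-1 maps each byte 0x00-0xff to the code point with the same number,
# so this decode step cannot fail on any input.
text = data.decode("latin-1")
utf8_bytes = text.encode("utf-8")

# Bytes >= 0x80 become two bytes in UTF-8, so the payload changes and grows.
changed = utf8_bytes != data
grew_by = len(utf8_bytes) - len(data)

# The receiver must undo both steps to recover the original bytes.
restored = utf8_bytes.decode("utf-8").encode("latin-1")
```

If the receiving side does not reverse both steps, it ends up with a file that is not a valid PNG, which is the corruption discussed in the comments.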

6 Comments

While this is true, the result has no meaning.
Without meaning? A supercomputer was once asked, "What is the meaning of life?" I'm pretty sure it answered, '42'.decode('latin1').encode('utf-8'). :-)
@IgnacioVazquez-Abrams The latin-1 codec maps codepoints "0-255 to the bytes 0x0-0xff": docs.python.org/2/library/codecs.html#encodings-and-unicode So, this does exactly what you'd expect.
@yingted: I'm pretty sure that converting 0xd0 to 0xc3 0x90, etc. will corrupt the image, which is hardly what you'd want.
@IgnacioVazquez-Abrams I just hit this. I have to transfer a blob across a UTF-8 channel, but I don't have a base64 decoder on the receiving side. This would be the correct answer if the question made any sense.
4

Text encodings only apply to text. Do not attempt to use them on binary data.

2 Comments

Yes, my colleague and I set up another POST module.
b'DRIVES STORE TEXT AS BYTES'.decode ('utf8')
4

What you're trying to accomplish can probably be achieved by base64 encoding it.

 import base64
 encoded = base64.b64encode(image_binary_data)
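
As a rough sketch of the full round trip in Python 3: the sample bytes below are an assumed stand-in for the image file contents (just the PNG signature), and the variable names are illustrative only.

```python
import base64

# Stand-in for the bytes read from the image file (assumption for illustration).
image_binary_data = b"\x89PNG\r\n\x1a\n"

# b64encode returns ASCII-only bytes; decoding them gives a text string
# that can safely travel through a UTF-8 channel.
encoded_text = base64.b64encode(image_binary_data).decode("ascii")

# The receiving side reverses the step to get the exact original bytes.
decoded = base64.b64decode(encoded_text)
```

The trade-off is size: base64 output is roughly a third larger than the input, and the receiver needs to decode it before saving the file.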


1

Encoding means converting strings to storable bytes, and decoding means converting bytes to readable strings.

The data in your code is already encoded.
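
This is also why the error in the question is a UnicodeDecodeError even though the code calls encode: in Python 2, calling encode on a byte string first decodes it implicitly with the ascii codec. A small sketch in Python 3 that makes that implicit step explicit, using the PNG signature as an assumed stand-in for the file contents:

```python
# Stand-in for the file contents: the 8-byte PNG signature (assumption for illustration).
data = b"\x89PNG\r\n\x1a\n"

# Python 2's str.encode("utf-8") decodes the byte string as ascii first;
# doing that decode explicitly reproduces the failure.
try:
    data.decode("ascii")
    error_name = None
except UnicodeDecodeError as exc:
    error_name = type(exc).__name__

# Python 3 removes the ambiguity: bytes objects have no encode method at all.
has_encode = hasattr(data, "encode")
```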


1

An image cannot be converted into something like characters in UTF-8.
