How to convert the string representation of a binary string from a text file back into the UTF-8 encoded text it came from?

Question

I have a word in Russian: "привет". It is encoded into UTF-8 bytes using 'привет'.encode('utf-8'); the result is a Python bytes object represented as:

b'\xd0\xbf\xd1\x80\xd0\xb8\xd0\xb2\xd0\xb5\xd1\x82'

I saved it to a file, and when I read that file back I get this string: "b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'"

How do I decode this string into the original word?

It is not the bytes object I'm trying to decode but a string, so

"b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'".decode('utf-8')

returns AttributeError: 'str' object has no attribute 'decode'

The way I save it to a file is simply by calling logger.info(x.encode('utf-8')), where logger is

import logging 
logger = logging.getLogger('GENERATOR_DYNAMICS')

and the way I read a file is

with open('file.log') as f:
    logs = f.readlines()

The 1st hit on DuckDuckGo for "python decode byte string utf8" was the dupe - you did not really search a lot, did you? Please read How to Ask - first line of duty is doing research. (not my downvote, btw)
– Patrick Artner, Commented Oct 6, 2020 at 14:19
@PatrickArtner the problem is not decoding a bytes object; I'm trying to decode a string
– Alexander Yalunin, Commented Oct 6, 2020 at 14:26
Maybe you could edit your post and show how you write to the file, how you read from the file and what exactly the problem is. If you write (binary) into a file and read (binary) from a file you get the (binary) values back.
– Patrick Artner, Commented Oct 6, 2020 at 14:29
If you write the string representation of your bytes object into a text file, you need to turn it back into a bytes object first: import ast + print("b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'", ast.literal_eval("b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'" ).decode("utf8"))
– Patrick Artner, Commented Oct 6, 2020 at 14:34
@PatrickArtner thank you, that is exactly what I was looking for
– Alexander Yalunin, Commented Oct 6, 2020 at 14:38

Patrick Artner · Accepted Answer · 2020-10-06 14:47:02Z

2

Your problem is twofold:

You have the string representation of a bytes object (from a file, but that's kind of irrelevant).
You want to decode that bytes object back into the text it was encoded from.

So the solution is two steps as well:

import ast

# parse the string representation back into a bytes object
string_rep = "b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'"
as_binary = ast.literal_eval(string_rep)

# decode the UTF-8 bytes into text
text = as_binary.decode("utf8")

to get 'привет' again.
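A quick round trip shows why the two steps work. This is only a sketch: it uses repr() to reproduce the text that ends up in the file, on the assumption that the logger writes the usual text form of the bytes object.

```python
import ast

word = "привет"
encoded = word.encode("utf-8")      # a real bytes object

# Assumption: the file holds the usual text form of the bytes object,
# which is what repr() produces: the characters b'\xd0\xbf...'
as_logged = repr(encoded)
assert isinstance(as_logged, str)

# literal_eval parses that text as a Python literal and returns bytes again
restored = ast.literal_eval(as_logged)
assert isinstance(restored, bytes)

print(restored.decode("utf-8"))     # prints привет
```

ast.literal_eval only accepts Python literals, so unlike eval it will not run arbitrary code found in the file.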

The last part is a duplicate of Python3: Decode UTF-8 bytes converted as string
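If the lines come from a log file, they may carry a prefix and a trailing newline around the bytes literal. Here is a rough sketch for that case. The helper decode_logged_bytes, the pattern BYTES_LITERAL and the "INFO:GENERATOR_DYNAMICS:" prefix in the sample lines are made up for illustration; adjust them to whatever your log format really looks like.

```python
import ast
import re

# Hypothetical pattern: finds a bytes literal such as b'\xd0\xbf' or b"it's"
# anywhere in a line, allowing backslash escapes inside the quotes.
BYTES_LITERAL = re.compile(
    r"\bb'(?:[^'\\]|\\.)*'"     # single-quoted form
    r'|\bb"(?:[^"\\]|\\.)*"'    # double-quoted form
)


def decode_logged_bytes(line):
    """Return the UTF-8 text hidden in a log line, or None if there is none."""
    match = BYTES_LITERAL.search(line)
    if match is None:
        return None
    try:
        value = ast.literal_eval(match.group(0))
        return value.decode("utf-8")
    except (ValueError, SyntaxError, UnicodeDecodeError):
        # not a valid literal, or the bytes are not valid UTF-8
        return None


# Sample lines standing in for f.readlines(); the prefix is an assumption.
lines = [
    "b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'\n",
    "INFO:GENERATOR_DYNAMICS:b'\\xd0\\xbf\\xd1\\x80\\xd0\\xb8\\xd0\\xb2\\xd0\\xb5\\xd1\\x82'\n",
    "no bytes literal here\n",
]
decoded = [decode_logged_bytes(line) for line in lines]
print(decoded)
```

Returning None for lines without a usable literal keeps the loop going over mixed log output; raise instead if a missing literal should count as an error.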

answered Oct 6, 2020 at 14:47

community wiki

Patrick Artner
