How can I convert bytes to a string without changing the data?
E.g.
Input:
file_data = b'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'

Output:
'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'

I want to write image data to a StringIO buffer along with some additional data. Below is my code snippet:

img_buf = StringIO()
f = open("Sample_image.jpg", "rb")
file_data = f.read()
img_buf.write('\r\n' + file_data + '\r\n')

This works fine with Python 2.7, but I want it to work with Python 3.4.
On the read operation, file_data = f.read() returns a bytes object, something like this:

b'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'  

When writing, img_buf accepts only str data, so I am unable to write file_data together with the additional characters. I therefore want to convert file_data into a str object as it is, without changing its data. Something like this:

'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'  

so that I can concat and write the image data.

I don't want to decode or encode the data. Any suggestions would be helpful. Thanks in advance.

  • Are you asking how to convert the bytes data to a string? Just my_string = file_data.decode('utf-8')? (Realize that decoding is literally converting bytes to a string... are you sure you don't want to decode it?) Commented Mar 23, 2018 at 15:35
  • my_string = file_data.decode('utf-8') Gives error as UnicodeDecodeError: 'utf-8' codec can't decode byte 0xb4 in position 0: invalid start byte Commented Mar 23, 2018 at 15:38
  • I won't try to close this as a duplicate, because I don't yet understand what you want, but does stackoverflow.com/questions/13837848/… solve your problem? Commented Mar 23, 2018 at 15:40
  • Actually, I'll step it back a bit. Please read How to Ask. Your title mentions "bytes data of an image". This is presumably 64-bit encoded image data that you're loading. With the appropriate functions, you can convert this data to an image. Why do you want it to be a string? Can you please edit your question to tell us what exactly you're trying to do with it? Commented Mar 23, 2018 at 15:45
  • You'll need to explain better what you mean by "without changing data". The data doesn't change; only the way it is interpreted does when you use decode. I'll just leave this here as recommended reading, and wish you good luck. docs.python.org/3.3/howto/unicode.html Commented Mar 23, 2018 at 15:55

1 Answer

It is not clear what kind of output you desire. If you only want a string that looks like the bytes literal, without decoding the bytes:

s = str(file_data)[1:]
print(s)
# '\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'

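This is the informal string representation of the original byte string, with the leading b stripped. No decoding takes place, so s holds each escape sequence as literal text (a backslash, an x, and two hex digits) rather than the original byte value.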


Details

The official string representation looks like this:

s
# "'\\xb4\\xeb7s\\x14q[\\xc4\\xbb\\x8e\\xd4\\xe0\\x01\\xec+\\x8f\\xf8c\\xff\\x00 \\xeb\\xff'"

String representation handles how a string looks. The doubled backslashes and the outer double quotes appear only in the official representation of s. The string itself contains single backslash characters, which is why the print function shows them undoubled.
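
As a rough illustration of the representation point, the sketch below checks what s actually contains. The name first_escape is only illustrative, and file_data is the sample value from the question.

```python
# Sample bytes taken from the question.
file_data = b'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'

# Strip the leading "b" from the representation of the bytes object.
s = str(file_data)[1:]

# s[0] is the opening quote; the next four characters are the first escape.
# The single byte 0xb4 has become four text characters: backslash, x, b, 4.
first_escape = s[1:5]
print(first_escape)

# s is therefore longer than the original data: it describes the bytes,
# it does not contain them.
print(len(file_data), len(s))
```

So s is useful for display or logging, but writing it to a file would not reproduce the image.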

String interpretation handles what a string means. Each run of bytes means something different depending on the applied encoding. Here we interpret the bytes (e.g. \xb4, \xeb, 7, s) with the UTF-8 encoding. Byte sequences that are not valid in this encoding are replaced with a default character, � (U+FFFD):

file_data.decode("utf-8", "replace")
# '��7s\x14q[Ļ���\x01�+��c�\x00 ��'

Decoding bytes to a string is required if you want to work with the data reliably as text.

In short, there is a difference in string output between how it looks (representation) and what it means (interpretation). Clarify which you prefer and proceed accordingly.
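
If what you want is a str whose characters carry the same numeric values as the original bytes, one option (strictly speaking still a decode, so it may not be what you had in mind) is the Latin-1 codec, which maps each byte value 0-255 to the code point with the same number. A minimal sketch, assuming file_data is the sample from the question; text and restored are illustrative names:

```python
# Sample bytes taken from the question.
file_data = b'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'

# Latin-1 maps every byte value to the code point with the same number,
# so no byte is rejected or replaced.
text = file_data.decode("latin-1")

# Each character has the same numeric value as the byte it came from.
same_values = [ord(ch) for ch in text] == list(file_data)
print(same_values)

# Encoding with the same codec gives back the original bytes.
restored = text.encode("latin-1")
print(restored == file_data)
```

Note that this only round-trips if the string is later encoded with Latin-1 again. Writing text to a stream that encodes as UTF-8 would change the bytes.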

Addendum

If your question is "how do I concatenate a byte string?", here is one approach:

import io

buffer = io.BytesIO()
with buffer as f:
    f.write(b"\r\n")
    f.write(file_data)
    f.write(b"\r\n")
    # Read the value before the with block closes the buffer.
    print(buffer.getvalue())
# b'\r\n\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff\r\n'

Equivalently:

buffer = b""
buffer += b"\r\n"
buffer += file_data
buffer += b"\r\n"
buffer
# b'\r\n\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff\r\n'
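
Applied to the snippet in the question, a Python 3 port might look like the sketch below. It assumes that a bytes buffer is acceptable in place of StringIO, and it uses a temporary file as a stand-in for Sample_image.jpg so that the example runs on its own; sample, path and result are illustrative names.

```python
import io
import os
import tempfile

# Stand-in for "Sample_image.jpg": a temporary file holding the sample bytes.
sample = b'\xb4\xeb7s\x14q[\xc4\xbb\x8e\xd4\xe0\x01\xec+\x8f\xf8c\xff\x00 \xeb\xff'
with tempfile.NamedTemporaryFile(suffix=".jpg", delete=False) as tmp:
    tmp.write(sample)
    path = tmp.name

try:
    # Binary mode returns bytes in Python 3.
    with open(path, "rb") as f:
        file_data = f.read()
finally:
    os.remove(path)

# BytesIO accepts bytes, so the separators are bytes literals as well.
img_buf = io.BytesIO()
img_buf.write(b"\r\n" + file_data + b"\r\n")
result = img_buf.getvalue()
print(result)
```

Everything stays as bytes from the read to the write, so no decoding or encoding step is involved.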

2 Comments

I tried both of the ways you mentioned in the answer. file_data = str(file_data)[2:-1] returns a string like '\\xb4\\xeb7s\\x14q[\\xc4\\xbb\\x8e\\xd4\\xe0\\x01\\xec+\\x8f\\xf8c\\xff\\x00 \\xeb\\xff', which contains escaped backslashes, and I don't want escaped backslashes. I have added a brief explanation to the question for more information. Thanks @pylang
s has escaped characters because it is a string. To my knowledge, you cannot change that (and probably should not try). It seems like you just want to append data to a byte string, correct? If so, is it necessary to have a string result? How about modifying a byte string instead?
