I want to read bytes 1, 2 and 3 from a file. I know they correspond to a string (in this case the "ELF" magic in the header of a Linux binary).

Following examples I could find on the net I came up with this:

import struct

with open('hello', 'rb') as f:
    f.seek(1)
    bytes = f.read(3)
    st = struct.unpack('s', bytes)
    print st

Looking at the official documentation of struct, it seems that passing 's' as the format argument should allow me to read a string.

I get the error:

st = struct.unpack('s', bytes)
struct.error: unpack requires a string argument of length 1

EDIT: Using Python 2.7

  • Please specify the Python version you are using. In Python 2 "str" and "bytes" are aliases. In Python 3, file.read() tries to convert bytes into unicode using UTF-8 encoding. Commented May 2, 2014 at 8:47
  • @Grapsus: it is false. f.read() returns bytes on both Python 2 and 3 if the file is opened in a binary mode 'b' (OP uses 'rb' i.e., binary). open in text mode uses locale.getpreferredencoding(False) encoding that sometimes may be utf-8 in Python 3 Commented May 2, 2014 at 12:14
  • Don't use the name bytes; it shadows the builtin. Commented May 2, 2014 at 12:24

2 Answers

In your special case, it is enough to just check

if bytes == 'ELF':

This tests in one step whether the three bytes are the characters E, L and F.
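
A minimal sketch of that check, written so that it runs on both Python 2.7 and Python 3 (hence the b'...' literal, which is a plain str in Python 2). The helper name read_magic and the temporary stand-in file are assumptions made for illustration; they are not from the question.

```python
import os
import tempfile


def read_magic(path, offset=1, length=3):
    """Return `length` raw bytes starting at `offset` (hypothetical helper)."""
    with open(path, 'rb') as f:
        f.seek(offset)
        return f.read(length)


# Build a stand-in file that starts like an ELF header: 0x7f 'E' 'L' 'F'.
fd, path = tempfile.mkstemp()
with os.fdopen(fd, 'wb') as f:
    f.write(b'\x7fELF' + b'\x00' * 12)

magic = read_magic(path)
is_elf = (magic == b'ELF')  # compare all three bytes at once
os.remove(path)
print(is_elf)
```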

If you want the numerical values instead, you still do not need to unpack anything. Just use ord(bytes[i]) (with i in 0, 1, 2) to get the values of the three bytes.

Alternatively you can use

byte_values = struct.unpack('bbb', bytes)

to get a tuple of the three bytes. You can also unpack that tuple on the fly in case the bytes have nameable semantics like this:

width, height, depth = struct.unpack('bbb', bytes)

Use 'BBB' instead of 'bbb' if the byte values should be unsigned.
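
For reference, the 's' format in the question fails because a bare 's' describes a one-byte string: the number in front of 's' is the string length, not a repeat count. A short sketch, assuming a sample value data that stands in for the result of f.read(3):

```python
import struct

data = b'ELF'  # stand-in for the three bytes returned by f.read(3)

# '3s' reads one 3-byte string; unpack always returns a tuple.
(name,) = struct.unpack('3s', data)

# 'BBB' reads three unsigned bytes as integers.
values = struct.unpack('BBB', data)

# calcsize shows why a bare 's' only accepts a single byte.
size_s = struct.calcsize('s')
size_3s = struct.calcsize('3s')

print(name, values, size_s, size_3s)
```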

4 Comments

  • Actually, your bytes already is the string you are looking for, i.e. you can just test if bytes == 'ELF': …. Sometimes things are just easy ;-)
  • Why is this a comment? f.read(3) == b'ELF' should be at the top of the answer.
  • If you need integers instead of bytestrings, you could use bytearray(bytestring) or a = array.array('B'); a.fromfile(file, 3). To read structured data from a binary file, you could use struct.Struct.unpack_from(buf) or file.readinto(ctypes_structure) or (ctypes_structure * N).from_buffer(mmap_file)
  • At the time I wrote my comment, the constant 'ELF' had only been stated in an additional question in a comment.
In Python 2, read returns a string, in the sense of a "string of bytes". To get a single byte, use bytes[i]; it returns another string containing just that one byte. If you need the numeric value of a byte, use ord: ord(bytes[i]). Finally, to get the numeric values of all bytes, use map(ord, bytes).

In [4]: s = "foo"

In [5]: s[0]
Out[5]: 'f'

In [6]: ord(s[0])
Out[6]: 102

In [7]: map(ord, s)
Out[7]: [102, 111, 111]
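
On Python 3, map returns an iterator and indexing a bytes object already yields an int, so ord is not needed there. A sketch that behaves the same on Python 2.7 and Python 3 uses bytearray, as suggested in a comment on the other answer (the sample value is assumed):

```python
data = b"foo"  # stand-in for bytes read from a file opened in binary mode

values = list(bytearray(data))  # integer value of every byte
first = bytearray(data)[0]      # integer value of the first byte

print(values)
print(first)
```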

1 Comment

I am a bit confused about the return type of read in Python 2.7. I opened a file in binary mode and read a byte, say byte = f.read(1). When I do print type(byte) it says str. Is a "string of bytes" different from a normal string? Which datatype in Python maps to "string of bytes"?
