104

I'm trying to read a BMP file in Python. I know the first two bytes hold the BMP signature and the next 4 bytes hold the file size. When I execute:

fin = open("hi.bmp", "rb")
firm = fin.read(2)  
file_size = int(fin.read(4))  

I get:

ValueError: invalid literal for int() with base 10: 'F#\x13'

What I want is to read those four bytes as an integer, but it seems Python is reading them as characters and returning a string, which cannot be converted to an integer. How can I do this correctly?

2
  • 2
    If your goal is to use the bitmap instead of spending time writing your own BMP library (not that that doesn't sound like fun...) you can use PIL pythonware.com/products/pil which you may already have installed. Try: import Image Commented Jul 22, 2009 at 7:24
  • 9
    Thanks Jared, but I wanted to read the bmp manually only to have fun! :) Commented Jul 22, 2009 at 7:33

7 Answers

143

The read method returns a sequence of bytes (a str in Python 2, a bytes object in Python 3). To convert that byte sequence into numeric values, use the built-in struct module: http://docs.python.org/library/struct.html.

import struct

print(struct.unpack('i', fin.read(4)))

Note that unpack always returns a tuple, so struct.unpack('i', fin.read(4))[0] gives the integer value that you are after.

You should probably use the format string '<i' (< is a modifier that indicates little-endian byte-order and standard size and alignment - the default is to use the platform's byte ordering, size and alignment). According to the BMP format spec, the bytes should be written in Intel/little-endian byte order.
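As a minimal sketch of the advice above, the snippet below builds a made-up 14-byte BMP file header in memory and reads the size field from it. The helper name `read_bmp_file_size` and the header values are assumptions for illustration only. It uses '<I' (unsigned) rather than '<i', because the BMP file header declares the size as an unsigned 32-bit field; the two formats give the same result for any file smaller than 2 GiB.

```python
import io
import struct


def read_bmp_file_size(stream):
    """Return the file-size field of a BMP header (hypothetical helper)."""
    signature = stream.read(2)  # b"BM" for a Windows bitmap
    if signature != b"BM":
        raise ValueError("not a BMP file")
    raw = stream.read(4)
    if len(raw) != 4:
        raise ValueError("truncated header")
    # '<' = little-endian, standard sizes; 'I' = unsigned 32-bit integer.
    (file_size,) = struct.unpack("<I", raw)
    return file_size


# Illustrative header: signature, size 1234, two reserved shorts, pixel offset 54.
fake_header = struct.pack("<2sIHHI", b"BM", 1234, 0, 0, 54)
size = read_bmp_file_size(io.BytesIO(fake_header))
print(size)
```

With a real file, the same helper could be called on the object returned by open("hi.bmp", "rb").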


8 Comments

Instead of writing i = struct.unpack(...)[0] I often write i, = struct.unpack(...)
@Otto Is there any reason you prefer one way over the other? Is there any logical difference?
I find it very surprising that there isn't a built-in function to read integers (or Shorts etc) from a file in Python. I'm no Java expert but I believe it has native functions such as readUnsignedShort() to do this.
@codeape Could you define what the [0] is doing please or at least what type of language element it is. It isn't immediately apparent and it is almost impossible to search for in the Python documentation.
For lists and tuples, obj[N] means: get the Nth element of obj. See docs.python.org/tutorial/introduction.html#lists
70

An alternative method which does not make use of 'struct.unpack()' would be to use NumPy:

import numpy as np

f = open("file.bin", "rb")  # binary mode, since the file holds raw bytes
a = np.fromfile(f, dtype=np.uint32)

'dtype' represents the datatype and can be int#, uint#, float#, complex# or a user defined type. See numpy.fromfile.

I personally prefer using NumPy to work with array/matrix data, as it is a lot faster than using Python lists.

3 Comments

File opening can be skipped: a = np.fromfile('file.bin', dtype=np.uint32)
In my case this didn't work directly. Depending on your encoding you may need a more explicit dtype, such as np.fromfile(file, dtype='>i2'), where > or < selects big- or little-endian. Depending on the number of bytes, you can use i2 or i4.
To me, the idea of using such a huge and complicated package as NumPy for so low-level and elementary operations is very much of an overkill.
38

As of Python 3.2, you can also accomplish this using the built-in int.from_bytes method:

file_size = int.from_bytes(fin.read(2), byteorder='big')

Note that this function requires you to specify whether the number is encoded in big- or little-endian format, so you will have to determine the endian-ness to make sure it works correctly.
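To make the byte-order point concrete, here is a small sketch that decodes the same four bytes both ways. The byte values are made up for illustration (they encode 1234 in little-endian order, which is what the BMP format uses), and the in-memory stream stands in for the real file.

```python
import io

# Stand-in for the file: the 2-byte signature followed by a 4-byte size field.
stream = io.BytesIO(b"BM" + b"\xd2\x04\x00\x00")
stream.read(2)               # skip the signature
size_bytes = stream.read(4)  # the size field is 4 bytes wide

little = int.from_bytes(size_bytes, byteorder="little")
big = int.from_bytes(size_bytes, byteorder="big")

print(little)  # the value a BMP reader wants
print(big)     # same bytes, wrong byte order
```

Only the "little" interpretation recovers 1234; the "big" one yields a much larger, meaningless number.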

6 Comments

in python3, int's are dynamically sized (it's the same implementation as Python2's long; this is also sometimes referred to as "big int's"), which I believe is the motivation for adding this int method in the first place. docs.python.org/3/library/….
So why 2? How do you know?
ohhh the number 2's just from the OP; it's not a magic number or anything. this could be generalized by replacing 2 with some positive integer variable n and it'd work all the same
I wonder why you used big-endian byteorder, when for most people correct option would be "little". Not to mention that BMP specification requires little-endianness.
Well, most of modern architectures and OSes are little-endian, from what I know (admittedly, not too much). So when I tried your approach, I got weird result. And it might confuse people who aren't ready to investigate and cost you some reputation points =)
6

Besides struct, you can also use the array module:

import array
values = array.array('l')  # array of C longs; their size is platform-dependent (see values.itemsize)
values.fromfile(fin, 1)  # read 1 integer
file_size = values[0]
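Because the width of 'l' differs between platforms, a more defensive variant might look like the sketch below. It is only an illustration under stated assumptions: it uses the 'I' type code (C unsigned int), checks that the item size really is 4 bytes, and swaps bytes on big-endian machines. It also reads from an in-memory stream with made-up bytes and uses frombytes in place of fromfile, so that it runs without a file on disk.

```python
import array
import io
import sys

# Stand-in for the 4-byte size field of the file (1234, little-endian).
stream = io.BytesIO(b"\xd2\x04\x00\x00")

values = array.array("I")  # C unsigned int, normally 4 bytes wide
if values.itemsize != 4:
    raise RuntimeError("unsigned int is not 4 bytes on this platform")

values.frombytes(stream.read(4))  # append one item from the raw bytes
if sys.byteorder == "big":
    values.byteswap()  # the array uses native order; BMP data is little-endian

file_size = values[0]
print(file_size)
```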

3 Comments

Good point. But this solution is not as flexible as that of the struct module, since all elements read through values.read() must be long integers (it is not convenient to read a long integer, a byte, and then a long integer, with the array module).
I agree. array is an efficient way to read a binary file but not very flexible when we have to deal with structure, as you correctly mentioned.
array.read has been deprecated in favor of array.fromfile since Python 1.5.1
4

Since you are reading a binary file, you need to unpack the bytes into an integer, so use the struct module for that:

import struct
fin = open("hi.bmp", "rb")
firm = fin.read(2)  
file_size, = struct.unpack("<i", fin.read(4))  # "<" forces little-endian, as BMP requires

1 Comment

struct.unpack returns a tuple
1

When you read from a binary file in Python 3, you get back a bytes object. This is a bit like a list or tuple, except that it can only store integers from 0 to 255.

Try:

file_size = fin.read(4)
file_size0 = file_size[0]
file_size1 = file_size[1]
file_size2 = file_size[2]
file_size3 = file_size[3]

Or:

file_size = list(fin.read(4))

Instead of:

file_size = int(fin.read(4))
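The individual byte values still have to be combined into one number. A minimal sketch of that step, assuming Python 3 (where iterating over bytes yields integers) and the little-endian order that BMP uses; the byte values here are made up for illustration:

```python
# Stand-in for fin.read(4): the value 1234 stored little-endian.
raw = b"\xd2\x04\x00\x00"

# Unpacking a bytes object gives one int (0 to 255) per byte.
b0, b1, b2, b3 = raw

# Little-endian: the first byte is the least significant.
file_size = b0 | (b1 << 8) | (b2 << 16) | (b3 << 24)
print(file_size)
```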


-1

Here's a late solution, but I thought it might help.

fin = open("hi.bmp", "rb")
firm = fin.read(2)
file_size = 0
# BMP stores integers little-endian, so byte i supplies bits 8*i and up.
for i in range(4):
    file_size |= ord(fin.read(1)) << (8 * i)
