
I am reading a binary file using the following method:

numpy.fromfile(file, dtype=)

The binary file contains multiple data types, and I know how they are organized. Therefore I have defined a dtype array as follows:

dtypearr = [('a','i4',1),('b','S1',8),('c','i4',1),
 ('d','i4',1),('e','S1',8)]

This dtype array says that the first value in the binary file is one integer, followed by 8 characters, and so on.
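As a quick check of that layout, here is a minimal sketch that builds the same record as a NumPy dtype and inspects it. It assumes the scalar fields can be written without the explicit shape of 1 (recent NumPy versions warn about that form); the name `record` is only illustrative.

```python
import numpy as np

# Same fields as dtypearr; the shape of 1 on the scalar fields is left out.
record = np.dtype([('a', 'i4'), ('b', 'S1', 8), ('c', 'i4'),
                   ('d', 'i4'), ('e', 'S1', 8)])

print(record.names)       # field names in file order
print(record.itemsize)    # bytes per record: 4 + 8 + 4 + 4 + 8 = 28
print(record['b'].shape)  # 'b' is a sub-array of 8 one-byte strings
```

Dividing the size of a repeating block by `record.itemsize` gives the number of records it holds.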

The problem I am having is that the binary file is not the size of dtypearr. Instead, the file contains the structure defined in dtypearr repeated n times.

So far, I have repeated dtypearr with new field names until it is the same size as the binary file.

However, I was hoping to achieve this without repeating dtypearr. Instead, I want an array to be stored in each field. For example, I want structuredarray['a'] or structuredarray['b'] to give me an array instead of a single value.

Edit

Note that:

numpy.fromfile(file, dtype=dtypearr)

achieves what I want when the pattern repeats exactly. The solution below also works.

However, the pattern in the binary file I mentioned isn't exactly repeating. For example, there is a header portion and multiple subsections, and each subsection has its own repeating pattern. f.seek() will work for the last subsection, but not for the subsections before it.

  • What you are asking is a little confusing. Please provide some sample data; if it is in binary format, then some made-up data. Also, let us know what code you have tried and where it is failing. Commented May 11, 2016 at 13:43
  • I think you need to work with basic file reading routines, working through headers, sections, and repeated patterns on your own. Array creation will be the last step. Commented May 11, 2016 at 15:00
  • Yes, that was how my code originally was, but I think that I will have to call numpy.fromfile for each subsection, with a combination of f.seek and the "count" option of numpy.fromfile. Commented May 11, 2016 at 15:06

1 Answer

Try:

import numpy as np
import string

# Create some fake data
N = 10
dtype = np.dtype([('a', 'i4'), ('b', 'S8'), ('c', 'f8')])
a = np.zeros(N, dtype)
# randint excludes the upper bound, so (0, 4) draws from 0..3
# (np.random.random_integers is deprecated)
a['a'] = np.random.randint(0, 4, N)
a['b'] = list(string.ascii_lowercase[:N])
a['c'] = np.random.normal(size=(N,))

# Write to a binary file
a.tofile('test.dat')

# Read data into new array
b = np.fromfile('test.dat', dtype=dtype)

The arrays a and b are identical (i.e., np.all(a['a'] == b['a']) is True):

for col in a.dtype.names:
    print(col, np.all(a[col] == b[col]))

# Prints:
# a True
# b True
# c True

Update:

If you have header information, you can first open the file, seek to the starting point of the data and then read. For example:

# header_size is the length of the header in bytes; you must know it in advance
with open("test.dat", "rb") as f:
    f.seek(header_size)
    b = np.fromfile(f, dtype=dtype)

You have to know the size of the header (header_size), but then you should be good. If there are subsections, you can pass a count to np.fromfile to limit the number of items it grabs. I haven't tested whether count works for this. If you are not bound to this binary format, I would recommend using something like HDF5 to store multiple arrays in a single file.
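A minimal sketch of the count approach, under assumed conditions: the file has a fixed-size header followed by two subsections, and the record dtype and record count of each subsection are known in advance. The layout, the dtypes and every name here (header_size, dtype_a, dtype_b, n_a, n_b, sections.dat) are made up for illustration and do not come from the real file format.

```python
import os
import tempfile

import numpy as np

# Hypothetical layout: 16-byte header, then two subsections with
# different record dtypes and known record counts.
header_size = 16
dtype_a = np.dtype([('a', 'i4'), ('b', 'S8')])
dtype_b = np.dtype([('c', 'f8'), ('d', 'i4')])
n_a, n_b = 5, 3

# Build fake data so the example is self-contained.
sec_a = np.zeros(n_a, dtype_a)
sec_a['a'] = np.arange(n_a)
sec_a['b'] = b'abc'
sec_b = np.zeros(n_b, dtype_b)
sec_b['c'] = np.linspace(0.0, 1.0, n_b)
sec_b['d'] = np.arange(n_b) * 10

path = os.path.join(tempfile.mkdtemp(), 'sections.dat')
with open(path, 'wb') as f:
    f.write(b'\x00' * header_size)  # dummy header
    f.write(sec_a.tobytes())        # first subsection
    f.write(sec_b.tobytes())        # second subsection

# Read back: skip the header once, then read each subsection in turn.
# Each call reads exactly `count` records from the current file position.
with open(path, 'rb') as f:
    f.seek(header_size)
    read_a = np.fromfile(f, dtype=dtype_a, count=n_a)
    read_b = np.fromfile(f, dtype=dtype_b, count=n_b)

print(read_a['a'])  # one array per field, one entry per record
print(read_b['c'])
```

This does mean one np.fromfile call per subsection, but each call returns a structured array whose fields are full arrays, so the dtype never has to be repeated.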


4 Comments

Awesome! But there is some additional important information that I forgot to mention; I will edit my post.
Updated answer to reflect needing to skip a header.
Yes, I was expecting that, so I was writing another edit. Thank you for the prompt response. The problem is that there are multiple subsections, and f.seek will only work with the last one. Sorry, I didn't think I needed to go into this much detail.
OK, I haven't looked into count that much, but I will try it out. It seems like it would work. But it means that I would have to call np.fromfile for each subsection, right? I am reading a binary file produced by a piece of software, so I can't really control the format. But I have considered converting the binary to HDF5 and then reading that.
