I am having trouble reading a binary file back into NumPy. I have a NumPy array:

data = array([[ 0.        ,  0.        ,  7.821725  ],
              [ 0.05050505,  0.        ,  7.6358337 ],
              [ 0.1010101 ,  0.        ,  7.453858  ],
              ...,
              [ 4.8989897 ,  5.        , 16.63227   ],
              [ 4.949495  ,  5.        , 16.88153   ],
              [ 5.        ,  5.        , 17.130795  ]], dtype=float32)

I wrote this array to a file in binary format.

file = open('model_binary', 'wb')
data.tofile(file)

Now I am unable to get the data back from the saved binary file. I tried numpy.fromfile(), but it did not return the original values.

file = open('model_binary', 'rb')
data = np.fromfile(file)

When I printed the data I got [0.00000000e+00 2.19335211e-13 8.33400000e+04 ... 2.04800049e+03 2.04800050e+03 5.25260241e+07] which is absolutely not what I want.

I ran the following code to check what was in the file,

for line in file:
    print(line)
    break

I got the output as b'\x00\x00\x00\x00\......\c1\x07@\x00\x00\x00\x00S\xc5{@j\xfd\n' which I suppose is in binary format.

I would like to get the array back from the binary file as it was saved. Any help will be appreciated.

  • Why don't you use np.save? Commented Apr 1, 2021 at 15:59
  • @QuangHoang Actually, I have a very large file and hence I wish to save it in binary format. Commented Apr 1, 2021 at 16:05
  • You only need to specify the filename (or path) as the argument when using data.tofile and np.fromfile. But I am not sure why the Python file object does not work. Maybe you are not closing the stream after doing the write? Commented Apr 1, 2021 at 16:22
  • You could also specify the dtype=np.float32 keyword when using np.fromfile. Commented Apr 1, 2021 at 16:25
  • @SurajS np.save saves the file to .npy, which is binary. Commented Apr 1, 2021 at 16:48

1 Answer

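As Kevin noted, specifying the dtype is required: np.fromfile defaults to float64, so the float32 bytes in your file are misinterpreted. tofile also stores no shape information, so you need to reshape the result as well (you have 3 columns in your example). So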

import numpy as np

with open('model_binary', 'rb') as file:
    data = np.fromfile(file, dtype=np.float32).reshape((-1, 3))

should work for you.
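To see why both the dtype and the reshape matter, here is a minimal round-trip sketch. The small array and the temporary path are stand-ins I made up for illustration, not the data from the question.

```python
import os
import tempfile

import numpy as np

# Hypothetical stand-in for the array in the question: float32, 3 columns.
original = np.array([[0.0, 0.0, 7.821725],
                     [0.05050505, 0.0, 7.6358337],
                     [4.949495, 5.0, 16.88153],
                     [5.0, 5.0, 17.130795]], dtype=np.float32)

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'model_binary')

# tofile writes the raw bytes only: no dtype and no shape are stored.
original.tofile(path)

# With the default dtype (float64), every 8 bytes, i.e. two float32 values,
# are read as a single number, which produces meaningless values.
wrong = np.fromfile(path)

# Matching the dtype and reshaping explicitly restores the array.
restored = np.fromfile(path, dtype=np.float32).reshape((-1, 3))

print(wrong.size, restored.shape)
print(np.array_equal(original, restored))
```

Because the file holds nothing but raw values, the reader has to know the dtype and the number of columns from somewhere else.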

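As an aside, np.save does save to a binary format (.npy). That format records the dtype and shape in a header, so np.load restores the array without you specifying either, which avoids these issues.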
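A minimal sketch of the np.save route, again with a made-up array and a temporary path chosen for illustration:

```python
import os
import tempfile

import numpy as np

# Hypothetical float32 array with 3 columns, standing in for the real data.
original = np.linspace(0, 5, 12, dtype=np.float32).reshape((-1, 3))

# Temporary location used only for this sketch.
path = os.path.join(tempfile.mkdtemp(), 'model.npy')

# np.save writes a binary .npy file with a header describing dtype and shape.
np.save(path, original)

# np.load reads the header, so no dtype or reshape is needed.
loaded = np.load(path)

print(loaded.dtype, loaded.shape)
print(np.array_equal(original, loaded))
```

The file is still binary, so this should not be noticeably larger than the tofile output apart from the small header.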

2 Comments

The dtype needs to be np.float32, I think.
You are correct. I was copy-pasting from the question.
