3

I use Fortran to write data to a binary file in the following format

open(unit=99,form='unformatted',status='unknown')
do i=1,N
write(99) (i),(A(i)),(B(i))
enddo
close(99)

Here, A and B are double precision arrays. How can this binary data file be read in python?

PS: So far, I tried the following steps to read data.

with open('fort.99', 'rb') as binfile:
    data=binfile.read()

Though the file reading is succesful, I am unable to use the numpy functions on this data, which is my preferred choice for most of the analyses.

5
  • Please read documentation - mode b for binary files, t (default) for text. open(file_path, mode="rb") After you open it, you can access its contents like any other file, just the type will be different. Just do .decode() to convert to string. Commented Feb 18, 2020 at 15:41
  • How does that compile without initialising A, B, N or specifying format for formatted output? Commented Feb 18, 2020 at 16:09
  • Why do you refer to a file written formatted as binary? formatted means it has been formatted as ASCII text, not binary. Commented Feb 18, 2020 at 22:00
  • @MarkSetchell My apologies. The format is unformatted. It was a typographical error. There are many steps involved in the calculation of A& B which I have omitted here for clarity. N is an integer (=128) and A, B are double precision arrays of size N. Commented Feb 19, 2020 at 7:48
  • What did you try so far? Commented Feb 19, 2020 at 7:49

1 Answer 1

6

Updated Answer

I played around with this some more in Numpy and you can read the file much more cleanly like this - the information below still applies and explains how it works:

import numpy as np

# Read file and reshape as "records" of 28 bytes each
n = np.fromfile('fort.99',dtype=np.uint8).reshape(-1,28)
I = n[:,4:8].copy().view(np.int32)     # pick bytes 4-8, make contiguous and view as integer
A = n[:,8:16].copy().view(np.float64)  # pick bytes 8-16, make contiguous and view as float64
B = n[:,16:24].copy().view(np.float64) # pick bytes 16-24, make contiguous and view as float64

Original Answer

I changed your program to have some recognisable data:

      program thing
      REAL*8, DIMENSION(128) :: A,B
      INTEGER N
      N=128

      open(unit=99,form='unformatted',status='unknown')
      do i=1,N
      A(i)=100.0 * i
      B(i)=-1000.0 * i
      write(99) (2*i),(A(i)),(B(i))
      enddo
      close(99)
      end program

Then I looked at the file size and it was 3,584 bytes, so I divided that by 128 to get the bytes per fortran WRITE as 28.

So, inspected the data with xxd as follows:

xxd -c 28 fort.99

00000000: 1400 0000 0200 0000 0000 0000 0000 5940 0000 0000 0040 8fc0 1400 0000  ..............Y@.....@......
0000001c: 1400 0000 0400 0000 0000 0000 0000 6940 0000 0000 0040 9fc0 1400 0000  ..............i@.....@......
00000038: 1400 0000 0600 0000 0000 0000 00c0 7240 0000 0000 0070 a7c0 1400 0000  [email protected]......
00000054: 1400 0000 0800 0000 0000 0000 0000 7940 0000 0000 0040 afc0 1400 0000  ..............y@.....@......
00000070: 1400 0000 0a00 0000 0000 0000 0040 7f40 0000 0000 0088 b3c0 1400 0000  .............@.@............
0000008c: 1400 0000 0c00 0000 0000 0000 00c0 8240 0000 0000 0070 b7c0 1400 0000  [email protected]......
000000a8: 1400 0000 0e00 0000 0000 0000 00e0 8540 0000 0000 0058 bbc0 1400 0000  [email protected]......
000000c4: 1400 0000 1000 0000 0000 0000 0000 8940 0000 0000 0040 bfc0 1400 0000  ...............@.....@......
000000e0: 1400 0000 1200 0000 0000 0000 0020 8c40 0000 0000 0094 c1c0 1400 0000  ............. .@............
000000fc: 1400 0000 1400 0000 0000 0000 0040 8f40 0000 0000 0088 c3c0 1400 0000  .............@.@............

So there are 28 bytes per fortran WRITE and there are 4 bytes of indexing at the start and end of each record.

Then I decoded them like this:

#!/usr/bin/env python3

import struct

# Important to open the file in binary mode, 'b'
with open('fort.99','rb') as f:
     # You should really read until error in case there are more or fewer than 128 records one day - but you know there are 128

    for i in range(128):
        # Read one record, i.e. output of one fortran WRITE
        record = f.read(28)
        # Unpack 2 INTEGERs, 2 DOUBLEs and an INTEGER
        _, I, A, B, _ = struct.unpack('2i2di',record)
        print(I,A,B)

Output

2 100.0 -1000.0
4 200.0 -2000.0
6 300.0 -3000.0
8 400.0 -4000.0
10 500.0 -5000.0
12 600.0 -6000.0
14 700.0 -7000.0
16 800.0 -8000.0
18 900.0 -9000.0
20 1000.0 -10000.0
22 1100.0 -11000.0
24 1200.0 -12000.0
26 1300.0 -13000.0
28 1400.0 -14000.0
30 1500.0 -15000.0
32 1600.0 -16000.0
34 1700.0 -17000.0
36 1800.0 -18000.0
38 1900.0 -19000.0
40 2000.0 -20000.0
42 2100.0 -21000.0
44 2200.0 -22000.0
46 2300.0 -23000.0
48 2400.0 -24000.0
50 2500.0 -25000.0
52 2600.0 -26000.0
54 2700.0 -27000.0
56 2800.0 -28000.0
58 2900.0 -29000.0
60 3000.0 -30000.0
62 3100.0 -31000.0
64 3200.0 -32000.0
66 3300.0 -33000.0
68 3400.0 -34000.0
70 3500.0 -35000.0
72 3600.0 -36000.0
74 3700.0 -37000.0
76 3800.0 -38000.0
78 3900.0 -39000.0
80 4000.0 -40000.0
82 4100.0 -41000.0
84 4200.0 -42000.0
86 4300.0 -43000.0
88 4400.0 -44000.0
90 4500.0 -45000.0
92 4600.0 -46000.0
94 4700.0 -47000.0
96 4800.0 -48000.0
98 4900.0 -49000.0
100 5000.0 -50000.0
102 5100.0 -51000.0
104 5200.0 -52000.0
106 5300.0 -53000.0
108 5400.0 -54000.0
110 5500.0 -55000.0
112 5600.0 -56000.0
114 5700.0 -57000.0
116 5800.0 -58000.0
118 5900.0 -59000.0
120 6000.0 -60000.0
122 6100.0 -61000.0
124 6200.0 -62000.0
126 6300.0 -63000.0
128 6400.0 -64000.0
130 6500.0 -65000.0
132 6600.0 -66000.0
134 6700.0 -67000.0
136 6800.0 -68000.0
138 6900.0 -69000.0
140 7000.0 -70000.0
142 7100.0 -71000.0
144 7200.0 -72000.0
146 7300.0 -73000.0
148 7400.0 -74000.0
150 7500.0 -75000.0
152 7600.0 -76000.0
154 7700.0 -77000.0
156 7800.0 -78000.0
158 7900.0 -79000.0
160 8000.0 -80000.0
162 8100.0 -81000.0
164 8200.0 -82000.0
166 8300.0 -83000.0
168 8400.0 -84000.0
170 8500.0 -85000.0
172 8600.0 -86000.0
174 8700.0 -87000.0
176 8800.0 -88000.0
178 8900.0 -89000.0
180 9000.0 -90000.0
182 9100.0 -91000.0
184 9200.0 -92000.0
186 9300.0 -93000.0
188 9400.0 -94000.0
190 9500.0 -95000.0
192 9600.0 -96000.0
194 9700.0 -97000.0
196 9800.0 -98000.0
198 9900.0 -99000.0
200 10000.0 -100000.0
202 10100.0 -101000.0
204 10200.0 -102000.0
206 10300.0 -103000.0
208 10400.0 -104000.0
210 10500.0 -105000.0
212 10600.0 -106000.0
214 10700.0 -107000.0
216 10800.0 -108000.0
218 10900.0 -109000.0
220 11000.0 -110000.0
222 11100.0 -111000.0
224 11200.0 -112000.0
226 11300.0 -113000.0
228 11400.0 -114000.0
230 11500.0 -115000.0
232 11600.0 -116000.0
234 11700.0 -117000.0
236 11800.0 -118000.0
238 11900.0 -119000.0
240 12000.0 -120000.0
242 12100.0 -121000.0
244 12200.0 -122000.0
246 12300.0 -123000.0
248 12400.0 -124000.0
250 12500.0 -125000.0
252 12600.0 -126000.0
254 12700.0 -127000.0
256 12800.0 -128000.0

As you can see there are record markers 1400 0000 on each record, meaning the records are 20 bytes long (1 fortran INTEGER@4, 2 fortran DOUBLE@8) with 4 bytes of record marker at each end - which I read into a variable called _ and discarded in Python. If you don't want that, you need to use DIRECT /STREAM output in fortran - but I don't know if you can control that at the other end of your processing chain. See here for a description.

Keywords: fortran, Python, binary, unformatted, xxd, dump, records, view as integer, view as float, view as float64, view as double

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.