
I have a .dat file that I'm trying to read in Python. The file format is:

1.1 CDR description1 
Field   length(bytes)   Offset 
x   4            0
x1  2            4
x2  1            6
x3  1            7
......
......
......
x4  16          210 
x5  4           226 
x6  70          230
Total length of information     300

These are CDR records, and I'm trying to read them with the struct module, but I can't work out how to use it with my specific file format. Any help?

  • What kind of information is stored in each field? You'll need that to figure out what struct module format codes to use. Commented Jul 25, 2012 at 10:04
  • Integers and letters. The binary file has 1000s of records and I'm trying to print them all... Commented Jul 25, 2012 at 10:51

1 Answer


You need to know what kind of information is stored in each field for the struct module to make sense of each field.

For example, the first field at offset 0 is 4 bytes long, which means it could be a signed int (ranging from −2,147,483,648 to +2,147,483,647) or an unsigned int (ranging from 0 to 4,294,967,295). It could also be a single-precision floating point number.
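To make the ambiguity concrete, here is a small sketch that decodes the same four bytes three ways. The bytes in `raw` are made up for illustration and do not come from the file in the question; little-endian order is assumed.

```python
import struct

# Four example bytes (fabricated, not taken from the real CDR file).
raw = b'\x00\x00\x80\xbf'

as_signed = struct.unpack('<i', raw)[0]    # 'i' = signed 4-byte int
as_unsigned = struct.unpack('<I', raw)[0]  # 'I' = unsigned 4-byte int
as_float = struct.unpack('<f', raw)[0]     # 'f' = single-precision float

print(as_signed, as_unsigned, as_float)
```

All three calls succeed, so struct cannot tell you which interpretation is right; only the format specification can.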

You probably also need to figure out the endianness of your file format. If this is not explicitly named you need to experiment a little, or infer from context what it would be (a Windows file format is almost always little-endian, for example).
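One way to experiment is to decode the same bytes with both byte orders and see which result looks plausible. The sketch below uses fabricated bytes, not data from the real file.

```python
import struct

# Four example bytes (fabricated for illustration).
raw = b'\x01\x00\x00\x00'

little = struct.unpack('<I', raw)[0]  # '<' = little-endian
big = struct.unpack('>I', raw)[0]     # '>' = big-endian

print(little, big)
```

If a field is expected to hold a small counter or a length, the byte order that yields a small, sensible value is usually the right one.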

If you want to unpack the first 4 values only, you read the correct number of bytes (8 in your format) and pass this to the struct.unpack function together with a set of formatting characters to tell struct what types to expect. If we assume little-endian data, and the first four fields represent an unsigned int, an unsigned short and two unsigned chars, respectively, you'd use:

import struct
with open('something.cdr', 'rb') as data:
    # '<' little-endian; I = unsigned int, H = unsigned short, 2B = two unsigned chars
    x, x1, x2, x3 = struct.unpack('<IH2B', data.read(8))
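Since the question mentions many records of 300 bytes each, the same idea can be extended to loop over the whole file. This is only a sketch: the helper name `iter_headers` is hypothetical, the field types are the same assumptions as above, and the two records at the bottom are fabricated so the example can run without the real file.

```python
import io
import struct

RECORD_SIZE = 300                # "Total length of information" from the question
HEADER = struct.Struct('<IH2B')  # assumed types for x, x1, x2, x3


def iter_headers(stream):
    """Yield (x, x1, x2, x3) for every complete 300-byte record in stream."""
    while True:
        record = stream.read(RECORD_SIZE)
        if len(record) < RECORD_SIZE:
            break  # end of file, or a truncated trailing record
        yield HEADER.unpack_from(record, 0)


# Two fabricated records, used only to exercise the function.
fake = b''.join(
    HEADER.pack(n, 2, 3, 4) + bytes(RECORD_SIZE - HEADER.size)
    for n in (10, 20)
)
headers = list(iter_headers(io.BytesIO(fake)))
print(headers)
```

With a real file you would pass the object returned by `open(path, 'rb')` instead of the `io.BytesIO` wrapper.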

12 Comments

Hmm, strange. I tried to print the first 3 fields using your logic, but x, x1, x2 = struct.unpack('<IH2B', data.read(7)) raises struct.error: unpack requires a string argument of length 8
Make sure you understand the format; the 2 in front of the B means "expect 2 character bytes". If you want to parse just 3 fields, you need to adjust this to '<IHB' instead.
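A quick way to check a format string against the number of bytes you read is struct.calcsize. The sketch below uses fabricated bytes and shows that a repeat count is shorthand for repeating the code.

```python
import struct

size_four = struct.calcsize('<IH2B')  # I (4) + H (2) + two B (1 each)
size_three = struct.calcsize('<IHB')  # I (4) + H (2) + one B (1)

# Eight fabricated bytes standing in for the start of a record.
data = bytes([1, 0, 0, 0, 2, 0, 3, 4])

four = struct.unpack('<IH2B', data)      # '2B' yields two separate values
same = struct.unpack('<IHBB', data)      # identical to writing B twice
three = struct.unpack('<IHB', data[:7])  # three fields need exactly 7 bytes

print(size_four, size_three, four, three)
```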
Is there a good tutorial on struct formats? I'm reading this one: docs.python.org/library/struct.html. Say I'm trying to get field 250: I know what I, H and B mean, but I can't grasp what the number in front of B does.
That document is very complete; the only other useful link I could find for you is doughellmann.com/PyMOTW/struct
Thank you for the help, I think I understand the formats now. But now I have a 14-byte field at offset 36 and I don't know how to print it with struct, because the format table doesn't list a standard size larger than 8.
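For a field like this, one option is the 's' format code, which takes a repeat count as a byte length and returns the raw bytes in one piece; struct.unpack_from lets you start at a given offset. The sketch below assumes the 14 bytes hold text or opaque data (the thread does not say what the field contains), and the record is fabricated.

```python
import struct

# A fabricated 300-byte record with 14 made-up bytes at offset 36.
record = bytearray(300)
record[36:50] = b'ABCDEFGHIJKLMN'

# '14s' reads 14 bytes as a single bytes object, starting at offset 36.
(field,) = struct.unpack_from('<14s', record, 36)

print(field)
```

If the field turns out to be text, calling `field.decode('ascii')` (or whichever encoding the format uses) would turn it into a string.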