File content:

40 13 123
89 123 2223
4  12  0

I need to store the whole .txt file as a binary array so that I can send it later to the server side which expects a binary input.


I've looked at Python's bytearray documentation. I quote:

Return a new array of bytes. The bytearray type is a mutable sequence of integers in the range 0 <= x < 256. It has most of the usual methods of mutable sequences, described in Mutable Sequence Types, as well as most methods that the bytes type has, see Bytes and Byte Array Methods.


Some of my numbers are greater than 255, so I need a bytearray-like data structure that can hold larger values.

  • Do you mean you want the text representation stored as an array of int32? Commented Mar 10, 2017 at 9:21
  • @xtofl Yes. But my problem is that, after I do that for each number, I would like to have the result in a binary object, so that if I access the first line I get the first number in its binary representation. Commented Mar 10, 2017 at 9:23
  • Do you have an example of what you want exactly? "101010" isn't a binary object, it's a string representing 42 in binary. 42, as an integer, is already stored as binary for Python. Commented Mar 10, 2017 at 9:24
  • @EricDuminil Yes sir, sorry for my bad explanation. A byte is 8 bits, and it can be sent as binary data. I need a sequence of many numbers in binary, laid out so that I know where to stop reading for the first number, the second number, and so on. One way, as xtofl said, is to represent each number in 32 bits. But I can't make bytearray store more than 8 bits per element, since any number greater than 255 can't be stored in it. Commented Mar 10, 2017 at 9:36
  • So just use an int array and be done with it. Doesn't the server specify exactly which format it expects? Commented Mar 10, 2017 at 9:50

5 Answers

8

You might use the array/memoryview approach:

import array
a = array.array('h', [10, 20, 300])  # assume the inputs fit in signed 16-bit (short) integers
memv = memoryview(a)
m = memv.cast('B')  # reinterpret the same buffer as unsigned bytes
m.tolist()

This gives [10, 0, 20, 0, 44, 1] on a little-endian machine (300 = 44 + 1 * 256, low byte first).

Depending on the usage, one might also do:

L = array.array('h', [10, 20, 300]).tobytes()  # named tostring() on Python 2
list(L)  # on Python 2, use list(map(ord, L)) instead

This also gives [10, 0, 20, 0, 44, 1].
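
For the receiving side, the same cast can be run in reverse. This is only a minimal sketch, assuming Python 3, that both ends share the same native byte order, and that the 'h' type code from the snippet above is used; the names `payload` and `decoded` are illustrative.

```python
import array

values = [10, 20, 300]
payload = array.array('h', values).tobytes()  # raw bytes, one 'h' item per value

# Reinterpret the raw bytes as signed short integers again.
decoded = memoryview(payload).cast('h').tolist()
print(decoded)
```

Because the cast only reinterprets the buffer, the payload length has to be a multiple of the item size.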

8 Comments

Nice! I see there is also array.fromlist(...).
"TypeError: cannot make memory view because object does not have the buffer interface". I read that the array.array object supports this only on Python 3? stackoverflow.com/questions/4877866/…
@xtofl it works fine on Python 3, but unfortunately it looks like applying memoryview on an array is not supported by Python 2.7 - bugs.python.org/issue17145
@TonyTannous then I would just change 'h' to 'i', i.e., replace short integer with integer...
I removed comment. It works now. Just give me a couple of minutes please. Perfect!
3

You can read in the text file and convert each 'word' to an int:

with open(the_file, 'r') as f:
    lines = f.readlines()  # one string per line of the file
    numbers = [int(w) for line in lines for w in line.split()]

Then you have to pack numbers into a binary array with struct:

import struct
# "i" is a native C int; the format string becomes e.g. "9i" for nine numbers
binary_representation = struct.pack("{}i".format(len(numbers)), *numbers)

If you want these data to be written in binary format, you have to specify so when opening the target file:

with open(target_file, 'wb') as f:
    f.write(binary_representation)
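
The receiving side can recover the integers with struct.unpack, as long as both ends agree on the layout. The sketch below assumes little-endian 4-byte signed integers; the "<" prefix is an assumption added here, since the plain "i" format above uses the machine's native byte order and size.

```python
import struct

numbers = [40, 13, 123, 89, 123, 2223, 4, 12, 0]

# "<" pins the layout: little-endian, 4 bytes per "i", no padding.
fmt = "<{}i".format(len(numbers))
binary_representation = struct.pack(fmt, *numbers)

# Receiving side: the item count follows from the payload length.
count = len(binary_representation) // 4
decoded = list(struct.unpack("<{}i".format(count), binary_representation))
print(decoded)
```

With a fixed width per number, the reader always knows where one value ends and the next begins.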

2 Comments

I agree that this double list comprehension syntax would be more readable, but unfortunately, it doesn't work. Also, if you iterate on a string, you get the characters, not the words.
Busted. It's the other way around. Thanks
2

Not bytearray

From the bytearray documentation, it is just a sequence of integers in the range 0 <= x < 256.

As an example, you can initialize it like this:

bytearray([40,13,123,89,123,4,12,0])
# bytearray(b'(\r{Y{\x04\x0c\x00')

Since integers are already stored in binary, you don't need to convert anything.

Your problem now becomes: what do you want to do with 2223?

>>> bytearray([2223])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: byte must be in range(0, 256)
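
To see why 2223 does not fit, it helps to count the bytes it needs. A short sketch using the built-in int.bit_length and int.to_bytes (Python 3); the big-endian order is an arbitrary choice for illustration.

```python
n = 2223
n_bytes = (n.bit_length() + 7) // 8    # 12 bits round up to 2 bytes
encoded = n.to_bytes(n_bytes, 'big')   # b'\x08\xaf', since 2223 == 8 * 256 + 175

# Each of the two byte values is a valid bytearray element on its own.
print(list(bytearray(encoded)))        # [8, 175]
print(int.from_bytes(encoded, 'big'))  # 2223
```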

uint32 or int32 array?

To read one file, you could use:

with open('test.txt') as f:
    # split() with no argument handles repeated spaces and the trailing newline
    numbers = [int(w) for line in f for w in line.split()]
    print(numbers)
    # [40, 13, 123, 89, 123, 2223, 4, 12, 0]

Once you have an integer list, you could choose the corresponding low-level NumPy data type, possibly uint32 or int32.

Comments

1

I needed this for a server-client module in which one of the functions required a binary input. Different thrift types can be found here.

Client

import array
myList = [5, 999, 430, 0]
binL = array.array('l', myList).tobytes()  # named tostring() on Python 2
# call function with binL as parameter

On the server, I reconstructed the list:

import array
k = list(array.array('l', binL))  # a bytes initializer is decoded item by item
print(k)
# [5, 999, 430, 0]
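
One caveat with the 'l' type code: its size is platform dependent (commonly 4 bytes on Windows and 8 bytes on 64-bit Linux), and array uses the machine's native byte order. Below is a hedged sketch of one way to pin the wire format down, assuming both ends agree on little-endian 'i' items; the helper names `encode` and `decode` are hypothetical and not part of the original module.

```python
import array
import sys

def encode(values):
    """Pack ints as little-endian 'i' items (4 bytes on common platforms)."""
    arr = array.array('i', values)
    if sys.byteorder == 'big':
        arr.byteswap()  # convert native big-endian items to little-endian
    return arr.tobytes()

def decode(payload):
    """Inverse of encode(): bytes back to a list of ints."""
    arr = array.array('i')
    arr.frombytes(payload)
    if sys.byteorder == 'big':
        arr.byteswap()
    return arr.tolist()

binL = encode([5, 999, 430, 0])
print(decode(binL))
```

If client and server may run on different platforms, checking array.array('i').itemsize on both sides (or using struct with an explicit "<" prefix) avoids surprises.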

Comments

0

Try this:

input.txt:

40 13 123
89 123 2223
4  12  0

Code to parse input to output:

with open('input.txt', 'r') as _in:
    # read the whole file, split it on whitespace, convert each token to an
    # integer, then format it as a binary string without the `0b` prefix
    nums = [format(int(w), 'b') for w in _in.read().split()]

with open('output.txt', 'w') as out:
    out.writelines(b + '\n' for b in nums)  # one binary string per line

output.txt:

101000
1101
1111011
1011001
1111011
100010101111
100
1100
0

Hope it helps.

4 Comments

Thanks, but I don't want to write it to a new file as binary, I need to hold it in a binary-object. Like bytearray or so. But I appreciate your effort. Thanks.
@TonyTannous: Then your question doesn't make sense, and it looks like you don't know exactly what you want to send.
@EricDuminil he knows, but does not have the proper terms for it.
@TonyTannous Then just use a list of ints. Later you can convert it to anything you want :D
