38

Input:

mystr = "100110"

Desired output numpy array:

mynumpy == np.array([1, 0, 0, 1, 1, 0])

I have tried:

np.fromstring(mystr, dtype=int, sep='')

but the problem is that I can't split my string into its individual digits, so numpy treats it as a single number. Any idea how to convert my string to a numpy array?

3 Answers

52

list can help here: it splits the string into its individual characters.

import numpy as np

mystr = "100110"
print(np.array(list(mystr)))
# ['1' '0' '0' '1' '1' '0']

If you want numbers instead of strings:

print(np.array(list(mystr), dtype=int))
# [1 0 0 1 1 0]

1 Comment

It should be noted that for large inputs, grc's first method using np.fromstring('...', np.int8) is much faster. Creating a list from the (large) string is unnecessary.
32

You could read them as ASCII characters, then subtract 48 (the ASCII value of '0'). This should be the fastest way for large strings.

>>> np.fromstring("100110", np.int8) - 48
array([1, 0, 0, 1, 1, 0], dtype=int8)

Alternatively, you could convert the string to a list of integers first:

>>> np.array(list(map(int, "100110")))  # list() is needed on Python 3, where map returns an iterator
array([1, 0, 0, 1, 1, 0])

Edit: I did some quick timing and the first method is over 100x faster than converting it to a list first.

1 Comment

I would strongly recommend using ord('0') instead of 48. Better explicit than implicit.
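
As a rough sketch of that suggestion, the subtraction can be written with ord('0') in place of the literal 48. This version assumes Python 3 and an ASCII-only input string, and it swaps in np.frombuffer on the encoded bytes instead of np.fromstring; the variable names are only illustrative.

```python
import numpy as np

mystr = "100110"

# frombuffer reads raw bytes, so encode the str first (assumes ASCII digits only).
codes = np.frombuffer(mystr.encode("ascii"), dtype=np.uint8)

# Subtract the code point of '0' instead of the magic number 48.
digits = codes - ord("0")

print(digits)
```

The result here has dtype uint8; call digits.astype(int) if a wider integer type is needed.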
14

Adding to the above answers: numpy now gives a deprecation warning when you use fromstring this way:
DeprecationWarning: The binary mode of fromstring is deprecated, as it behaves surprisingly on unicode inputs. Use frombuffer instead.
A better option is to use fromiter. In my test on this short string it ran about twice as fast. This is what I got in a Jupyter notebook:

import numpy as np
mystr = "100110"

np.fromiter(mystr, dtype=int)
>> array([1, 0, 0, 1, 1, 0])

# Time comparison
%timeit np.array(list(mystr), dtype=int)
>> 3.5 µs ± 627 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromstring(mystr, np.int8) - 48
>> 3.52 µs ± 508 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)

%timeit np.fromiter(mystr, dtype=int)
>> 1.75 µs ± 133 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
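
The timings above are for a six-character string, where fixed call overhead dominates, so they may not reflect behaviour on large inputs. A minimal sketch for repeating the comparison on a longer string outside a notebook might look like the following. The input length, the helper names via_list and via_frombuffer, and the repeat count are all arbitrary choices for illustration, and np.frombuffer is used here as the non-deprecated stand-in for the binary fromstring call.

```python
import timeit

import numpy as np

# Hypothetical larger input; the length is an arbitrary choice for illustration.
big = "100110" * 10_000


def via_list(s):
    # Build a list of one-character strings, then let numpy convert them to int.
    return np.array(list(s), dtype=int)


def via_frombuffer(s):
    # Read the ASCII bytes directly and shift them by the code point of '0'.
    return np.frombuffer(s.encode("ascii"), dtype=np.uint8) - ord("0")


# Both approaches should agree on the digits before timing them.
assert np.array_equal(via_list(big), via_frombuffer(big))

runs = 3
for fn in (via_list, via_frombuffer):
    seconds = timeit.timeit(lambda: fn(big), number=runs)
    print(f"{fn.__name__}: {seconds / runs:.6f} s per call")
```

Actual numbers depend on the machine, the numpy version, and the input size, so treat any single run as indicative only.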
