The code

np.array([100,200,300],dtype=str)

returns:

array(['1', '2', '3'], 
      dtype='|S1')

The documentation says:

dtype : data-type, optional

The desired data-type for the array. If not given, then the type will be determined as the minimum type required to hold the objects in the sequence.

Is this a bug?

  • Can you try using dtype='|S3' and see if that gives what you expect? Commented Aug 29, 2013 at 18:39
  • This question has been asked fairly recently, although I cannot find it right off. A detailed search of the numpy tag should lead you to it. Commented Aug 29, 2013 at 18:39
  • @SethMMorton Using '|S3' works Commented Aug 29, 2013 at 18:40
  • What happens if you have np.array([101,201,301],dtype=str) instead? Commented Aug 29, 2013 at 18:42
  • @SethMMorton Still returns the "erroneous" results. Commented Aug 29, 2013 at 18:49

2 Answers

I still cannot find the question, but to get around it:

>>> a=[100,200,300]

>>> np.char.mod('%d', a)
array(['100', '200', '300'],
      dtype='|S3')

This circumvents your problem:

>>> a=[100,200,3005]
>>> np.char.mod('%d', a)
array(['100', '200', '3005'],
      dtype='|S4')

The documentation is obscure, but note that this is roughly 4 times slower than choosing dtype="S..", yet non-linearly faster than using np.array(map(str,a)) methods.
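As a rough sketch of the two approaches compared above, assuming Python 3 and a recent NumPy (where map() returns an iterator and therefore needs list(), and where the resulting arrays hold unicode strings rather than '|S' bytes). The names via_mod and via_map are only illustrative:

```python
import numpy as np

a = [100, 200, 3005]

# Apply '%d' % x element-wise; NumPy sizes the items to fit the longest result.
via_mod = np.char.mod('%d', a)

# Pure-Python conversion first, then let NumPy infer the string width.
# list() is an assumption for Python 3, where map() is lazy.
via_map = np.array(list(map(str, a)))

print(via_mod.tolist())
print(via_map.tolist())
```

Both routes avoid having to pick the string width by hand; any timing difference should be measured on your own data and NumPy version.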

You can also do some neat things:

>>> a
[1234.5, 123.4, 12345]

>>> np.char.mod('%s',a)
array(['1234.5', '123.4', '12345.0'],
      dtype='|S7')

>>> np.char.mod('%f',a)
array(['1234.500000', '123.400000', '12345.000000'],
      dtype='|S12')

>>> np.char.mod('%d',a) #Note the truncation of decimals here.
array(['1234', '123', '12345'],
      dtype='|S5')

>>> np.char.mod('%s.stuff',a)
array(['1234.5.stuff', '123.4.stuff', '12345.0.stuff'],
      dtype='|S13')

Additional information can be found here.

2 Comments

Thanks! Can you add a link to the documentation for this function?
I also updated this with a few extra examples; depending on what you are doing, %d might not be optimal for you.

The reason you see this behavior is that you have to specify the size of each string element, e.g. using:

>>> np.array([100,200,300],dtype='S3')
array(['100', '200', '300'],
      dtype='|S3')

Otherwise the size of each element string will default to 1.

More info here: Numpy converting array from float to strings
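If the numbers are not known in advance, the width can be computed from the data before building the array. This is a minimal sketch, not part of NumPy: to_fixed_width_strings is a hypothetical helper, and it uses a 'U' (unicode) dtype on the assumption of Python 3, where 'S' would produce bytes instead.

```python
import numpy as np

def to_fixed_width_strings(values):
    """Hypothetical helper: convert numbers to strings sized to the longest one."""
    texts = [str(v) for v in values]
    # Width of the longest string; fall back to 1 for an empty input.
    width = max((len(t) for t in texts), default=1)
    # Explicit width, mirroring the dtype='S3' idea above (unicode variant).
    return np.array(texts, dtype="U%d" % width)

result = to_fixed_width_strings([100, 200, 3005])
print(result.tolist(), result.dtype)
```

This costs one extra pass over the input to find the longest element.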

1 Comment

The problem is that to do this for an arbitrary list of numbers I need to check the lengths of all the numbers first.
