10

I have a 2D numpy char array (from a NetCDF4 file) which actually represents a list of strings. I want to convert it into a list of strings.

I know I can use join() to concatenate the chars into a string, but I can only find a way to do this one string at a time:

import numpy as np

data = np.array([['a','b'],['c','d']])
for row in data[:]:
    print(''.join(row))

But it's very slow. How can I return an array of strings in a single command? Thanks

1 Comment

Why are you copying data in your for loop? Commented Jun 11, 2012 at 17:07

3 Answers

14

The list comprehension is the most "pythonic" way.

The most "numpythonic" way would be:

# dtype='S1' (single bytes) is needed on Python 3, where the default is unicode
>>> data = np.array([['a','b'],['c','d']], dtype='S1')
# a 2D view
>>> data.view('S2')
array([[b'ab'],
       [b'cd']], dtype='|S2')
# or maybe a 1D view ...fastest solution:
>>> data.view('S2').ravel()
array([b'ab', b'cd'], dtype='|S2')

No looping, no list comprehension, not even a copy. The buffer just sits there unchanged with a different "view" so this is the fastest solution available.

1 Comment

An important caveat is that the array must be contiguous in memory -- otherwise the view fails. You can ensure this by using data = np.ascontiguousarray(data).
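
Putting the view trick and the contiguity caveat together, here is a minimal sketch for Python 3. It assumes the chars arrive as single bytes (dtype 'S1'), which is what a NetCDF char variable usually gives you, and that the content is ASCII. The helper name chars_to_strings and the sample data are made up for illustration; they are not part of NumPy.

```python
import numpy as np

def chars_to_strings(chars):
    """Join each row of a 2D 'S1' array into one fixed-width byte string.

    Hypothetical helper, not part of NumPy.
    """
    if chars.dtype != np.dtype('S1') or chars.ndim != 2:
        raise TypeError("expected a 2D array of dtype 'S1'")
    # The view needs contiguous memory; this copies only when necessary.
    chars = np.ascontiguousarray(chars)
    width = chars.shape[1]
    # Reinterpret each row of `width` single bytes as one 'S<width>' item.
    return chars.view('S%d' % width).ravel()

# Made-up sample data; every second column is filler to be skipped.
wide = np.array([[b'a', b'x', b'b', b'x'],
                 [b'c', b'x', b'd', b'x']], dtype='S1')
sliced = wide[:, ::2]                  # non-contiguous: columns 0 and 2 only

joined = chars_to_strings(sliced)      # byte strings, dtype 'S2'
strings = joined.astype('U').tolist()  # decode (ASCII assumed) to Python str
print(strings)
```

Calling sliced.view('S2') directly would fail here because the slice is not contiguous, which is why the helper goes through np.ascontiguousarray first.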
5

Try a list comprehension:

>>> s = [''.join(row) for row in data]
>>> s
['ab', 'cd']

which is just your for loop rewritten.
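
If the array holds bytes rather than unicode (dtype 'S1', as NetCDF char variables usually do), the same comprehension needs a bytes separator and a decode step on Python 3. A small sketch, assuming ASCII content and made-up sample data:

```python
import numpy as np

# Hypothetical sample standing in for a NetCDF char variable (dtype 'S1').
data = np.array([[b'a', b'b'], [b'c', b'd']], dtype='S1')

# Join the single bytes of each row, then decode to a Python str.
# ASCII is an assumption; pass another codec to decode() if needed.
strings = [b''.join(row).decode('ascii') for row in data]
print(strings)
```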

2 Comments

@DavidRobinson Hadn't thought of that - very nice.
@AdrianR- don't forget to accept his answer (by clicking on the green checkmark) if it answered your question.
2
[row.tobytes() for row in data]  # tobytes() is the current name for the deprecated tostring()
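
One thing to watch with this approach: it returns the raw buffer of each row, so any NUL padding in short strings is kept. A rough sketch, assuming a bytes array (dtype 'S1') with ASCII content and made-up sample data; it uses tobytes(), the newer spelling of tostring():

```python
import numpy as np

# Made-up sample: the second string is shorter, so its last slot is empty.
data = np.array([[b'a', b'b'], [b'c', b'']], dtype='S1')

# Raw bytes of each row; empty slots show up as NUL bytes.
raw = [row.tobytes() for row in data]

# Strip the trailing NUL padding and decode (ASCII assumed).
cleaned = [r.rstrip(b'\x00').decode('ascii') for r in raw]
print(raw)
print(cleaned)
```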
