I have a pandas DataFrame with several columns, say 'a', 'b' and 'c', each containing NumPy arrays.

I would like to concatenate the arrays from the different columns, obtaining a single NumPy array for each row.

Is there an efficient way to do this that avoids iteration?

  • Each cell array is separate. Look, for example, at df['a'].to_numpy(); I expect it is a 1d object-dtype array. If the subarrays are all the same size, they can be stacked into one 2d array. But if they differ, you are stuck with working on separate arrays. This needs a minimal reproducible example. Commented Jun 17, 2022 at 18:09
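
To illustrate the equal-size case described in this comment, here is a minimal sketch. It assumes every cell in a given column holds an array of the same length; the frame and its values are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical data: every cell holds a length-2 array (an assumption, not from the question)
df = pd.DataFrame({
    'a': [np.array([1, 2]), np.array([3, 4])],
    'b': [np.array([5, 6]), np.array([7, 8])],
    'c': [np.array([9, 10]), np.array([11, 12])],
})

# Stack each column's cells into a 2d array of shape (n_rows, cell_length)
blocks = [np.stack(df[col].tolist()) for col in ['a', 'b', 'c']]

# Join the per-column blocks side by side: one concatenated row per DataFrame row
combined = np.hstack(blocks)

# Optionally store the result back as one array per cell
df['concat'] = list(combined)
```

The only Python-level loop here runs over the columns, not the rows. If the cell lengths differ within a column, np.stack raises a ValueError and this approach does not apply.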

1 Answer

You can concatenate the arrays in each row with the np.concatenate function, applied row-wise through DataFrame.apply:

import pandas as pd
import numpy as np

df = pd.DataFrame()
df['x'] = [np.array([1]), np.array([1, 2]), np.array([1, 2, 3])]
df['y'] = [np.array([1]), np.array([1, 2]), np.array([1, 2, 3])]
# Concatenate the two arrays of each row; cells are accessed by label because positional row[0] is deprecated
df['concat'] = df[['x', 'y']].apply(lambda row: np.concatenate((row['x'], row['y'])), axis=1)

df

           x          y              concat
0        [1]        [1]              [1, 1]
1     [1, 2]     [1, 2]        [1, 2, 1, 2]
2  [1, 2, 3]  [1, 2, 3]  [1, 2, 3, 1, 2, 3]

1 Comment

apply is a row-by-row iteration, but since the cell arrays all have different lengths, I don't think there's a way around it.
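
If row iteration cannot be avoided, a plain list comprehension over the columns is one alternative to apply. The sketch below reuses the answer's example frame; it is still a Python-level loop, but it does not build a Series for every row, which usually makes it faster than apply(axis=1). That is worth timing on your own data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame()
df['x'] = [np.array([1]), np.array([1, 2]), np.array([1, 2, 3])]
df['y'] = [np.array([1]), np.array([1, 2]), np.array([1, 2, 3])]

# zip pairs up the cells of each row; np.concatenate joins each pair into one array
df['concat'] = [np.concatenate(cells) for cells in zip(df['x'], df['y'])]
```

To cover more columns, add them to the zip call, for example zip(df['a'], df['b'], df['c']).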
