
I want to create a DataFrame from a dictionary whose values are 2D NumPy arrays.

import numpy as np

my_Dict = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
           'b': np.array([[7, 8, 9], [10, 11, 12]]),
           'c': np.array([[13, 14, 15], [16, 17, 18]])}

I expect the outcome to be a DataFrame with 2 rows (the number of rows in each NumPy array) and 3 columns, as below:

           a             b             c
0  [1, 2, 3]     [7, 8, 9]  [13, 14, 15]
1  [4, 5, 6]  [10, 11, 12]  [16, 17, 18]

I tried converting the values to lists and it worked, but I want to keep the values as NumPy arrays so that I can apply NumPy functions to them.

  • Just wondering, would all values in a column be of the same length? (Because if yes, you'll be a lot better off saving them as 3 columns instead of 1, and you'll still be able to use all NumPy operations on the underlying arrays.) Commented Apr 28, 2019 at 13:36
  • Thanks for your comment. I want to merge this DataFrame later with another one, and the columns represent the values of different attributes of some outcomes. That's why it is important for me that each column refers to a single attribute. Commented Apr 28, 2019 at 13:56
  • In that case, let me write up a suggestion for you to use here. Commented Apr 28, 2019 at 13:58

2 Answers

2
>>> list(np.array([[1, 2, 3],[4, 5, 6]]))
[array([1, 2, 3]), array([4, 5, 6])]
>>>

Transform each column's 2-D array into a list of 1-D arrays, one per row:

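import numpy as np
import pandas as pd

d = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
     'b': np.array([[7, 8, 9], [10, 11, 12]]),
     'c': np.array([[13, 14, 15], [16, 17, 18]])}

# list(v) splits each 2-D array into its rows, so every cell holds a 1-D array
df = pd.DataFrame({k: list(v) for k, v in d.items()})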

>>> df
           a             b             c
0  [1, 2, 3]     [7, 8, 9]  [13, 14, 15]
1  [4, 5, 6]  [10, 11, 12]  [16, 17, 18]
>>> 

>>> df.loc[0,'a']
array([1, 2, 3])
>>> df['a'].values
array([array([1, 2, 3]), array([4, 5, 6])], dtype=object)
>>> df.values
array([[array([1, 2, 3]), array([7, 8, 9]), array([13, 14, 15])],
       [array([4, 5, 6]), array([10, 11, 12]), array([16, 17, 18])]],
      dtype=object)
>>>
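
Since the stated goal is to apply NumPy functions to the cell values, here is a minimal sketch of how that might look with the object-dtype frame built above. The choice of `np.sum` and `np.stack` is only an illustration and is not something the question specifies.

```python
import numpy as np
import pandas as pd

d = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
     'b': np.array([[7, 8, 9], [10, 11, 12]]),
     'c': np.array([[13, 14, 15], [16, 17, 18]])}

# Each cell holds a 1-D NumPy array, so every column has dtype object.
df = pd.DataFrame({k: list(v) for k, v in d.items()})

# Apply a NumPy function cell by cell (this is a Python-level loop).
row_sums_a = df['a'].map(np.sum)

# Rebuild the 2-D array from one column when vectorised math is needed.
a_2d = np.stack(df['a'].to_numpy())
```

Stacking a column back into a 2-D array is usually the cheaper route when the same function has to run over every row, because the work then happens inside NumPy rather than once per cell.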

2

Given why you want to do this, I would instead recommend building a DataFrame with multi-level columns.

Given:

import numpy as np
import pandas as pd

myDict = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
          'b': np.array([[7, 8, 9], [10, 11, 12]]),
          'c': np.array([[13, 14, 15], [16, 17, 18]])}

Turn each array into its own DataFrame, then concatenate them along the columns to get a DataFrame with two column levels:

df = pd.concat([pd.DataFrame(v) for v in myDict.values()],
               axis=1, keys=list(myDict))

print(df)
   a         b           c        
   0  1  2   0   1   2   0   1   2
0  1  2  3   7   8   9  13  14  15
1  4  5  6  10  11  12  16  17  18

This allows the internal structures of the DataFrame to be NumPy arrays instead of dealing with objects. (This speeds up many operations, which would otherwise fall back to Python-level iteration on a column of dtype object.)
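
To make the dtype difference concrete, the sketch below builds both layouts from the same dictionary and inspects their column dtypes. The names `df_obj` and `df_multi` are hypothetical and are introduced here only for the comparison.

```python
import numpy as np
import pandas as pd

myDict = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
          'b': np.array([[7, 8, 9], [10, 11, 12]]),
          'c': np.array([[13, 14, 15], [16, 17, 18]])}

# Layout 1: one column per key, each cell holding a 1-D array.
df_obj = pd.DataFrame({k: list(v) for k, v in myDict.items()})

# Layout 2: two-level columns, each cell holding a plain integer.
df_multi = pd.concat([pd.DataFrame(v) for v in myDict.values()],
                     axis=1, keys=list(myDict))

print(df_obj.dtypes)    # every column is reported as object
print(df_multi.dtypes)  # every column is an integer dtype
```

The exact integer width reported for the second layout depends on the platform and NumPy version, so it is not shown here.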

You can still index normally:

print(df['a'])
   0  1  2
0  1  2  3
1  4  5  6

You can also operate on the underlying NumPy arrays, either directly or through .values:

df['a'] = df['a'].values * 10

print(df)
    a           b           c        
    0   1   2   0   1   2   0   1   2
0  10  20  30   7   8   9  13  14  15
1  40  50  60  10  11  12  16  17  18
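
A few other access patterns may be useful with this layout. The block below is a self-contained sketch that rebuilds the frame from the original values (before the multiplication above); the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

myDict = {'a': np.array([[1, 2, 3], [4, 5, 6]]),
          'b': np.array([[7, 8, 9], [10, 11, 12]]),
          'c': np.array([[13, 14, 15], [16, 17, 18]])}

df = pd.concat([pd.DataFrame(v) for v in myDict.values()],
               axis=1, keys=list(myDict))

# Pull one attribute back out as a regular 2-D integer array.
a_values = df['a'].to_numpy()

# Reduce across the inner level: one sum per row for attribute 'b'.
b_row_sums = df['b'].sum(axis=1)

# Select the same inner position (here 0) from every attribute.
first_elements = df.xs(0, axis=1, level=1)
```

Here `a_values` has the same shape and contents as the array originally stored under `'a'`, so any NumPy function can be applied to it without a per-cell loop.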

6 Comments

"...internal structures of the dataframe to be numpy arrays instead of dealing with objects." Why is that advantageous?
Am I mistaken in saying that?
I was only asking. I don't use MultiIndexed/hierarchical DataFrames/Series and don't have a good understanding of them. Intuitively, I think there is an advantage over my solution, which produces a DataFrame of objects.
OK, yep, coldspeed confirmed it here. Partial quote: "and all operations on objects fall back to a slow, loopy implementation."
That was good, thanks coldspeed :). Even without knowing what operations the OP will be doing, I suspect that at best my solution would have the same performance, and most probably it would be worse than operations with your solution.