14

Suppose we start with

import numpy as np
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

How can this efficiently be made into a pandas DataFrame equivalent to

>>> import pandas as pd
>>> pd.DataFrame({'a': [0, 0, 1, 1], 'b': [1, 3, 5, 7], 'c': [2, 4, 6, 8]})

   a  b  c
0  0  1  2
1  0  3  4
2  1  5  6
3  1  7  8

The idea is for column a to hold the index along the first dimension of the original array, and for the remaining columns to be a vertical concatenation of the 2D arrays formed by the last two dimensions of the original array.

(This is easy to do with loops; the question is how to do it without them.)


Longer Example

Using @Divakar's excellent suggestion:

>>> np.random.randint(0,9,(4,3,2))
array([[[0, 6],
        [6, 4],
        [3, 4]],

       [[5, 1],
        [1, 3],
        [6, 4]],

       [[8, 0],
        [2, 3],
        [3, 1]],

       [[2, 2],
        [0, 0],
        [6, 3]]])

This should be turned into something like:

>>> pd.DataFrame({
    'a': [0, 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3], 
    'b': [0, 6, 3, 5, 1, 6, 8, 2, 3, 2, 0, 6], 
    'c': [6, 4, 4, 1, 3, 4, 0, 3, 1, 2, 0, 3]})
    a  b  c
0   0  0  6
1   0  6  4
2   0  3  4
3   1  5  1
4   1  1  3
5   1  6  4
6   2  8  0
7   2  2  3
8   2  3  1
9   3  2  2
10  3  0  0
11  3  6  3
2
  • Shouldn't we have 'b': [1, 3, 5, 7] for that sample? Also, could you add another sample, like a = np.random.randint(0,9,(4,3,2)), just to see what to expect when the dimensions have different lengths? Commented Mar 26, 2016 at 12:26
  • @Divakar Thanks for the excellent comment! Commented Mar 26, 2016 at 12:45

3 Answers

27

Here's one approach that does most of the processing in NumPy before putting the result into a DataFrame, like so -

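m, n, r = a.shape
# Column 0: first-axis index, each value repeated n times; other columns: a flattened to (m*n, r)
out_arr = np.column_stack((np.repeat(np.arange(m), n), a.reshape(m*n, -1)))
out_df = pd.DataFrame(out_arr)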
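
To see why the two pieces line up row for row, it may help to look at them separately. The following is a small sketch on the 2x2x2 array from the question; the names first_idx and stacked are introduced here for illustration only and are not part of the original answer.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
m, n, r = a.shape

# First-axis index for every output row: each of 0..m-1 repeated n times
first_idx = np.repeat(np.arange(m), n)
print(first_idx)  # [0 0 1 1]

# The m blocks of shape (n, r) stacked vertically into one (m*n, r) array.
# reshape walks the array in C order, so block 0 comes first, then block 1, ...
stacked = a.reshape(m * n, -1)
print(stacked)

# Joining them column-wise gives the (m*n, 1 + r) array used for the DataFrame
out_arr = np.column_stack((first_idx, stacked))
print(out_arr)
```

Because both pieces are ordered by the first axis, row i of first_idx always describes row i of stacked.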

If you know that the last dimension has length 2, so that the result has a as the first column followed by b and c, you can add column names like so -

out_df = pd.DataFrame(out_arr,columns=['a', 'b', 'c'])

Sample run -

>>> a
array([[[2, 0],
        [1, 7],
        [3, 8]],

       [[5, 0],
        [0, 7],
        [8, 0]],

       [[2, 5],
        [8, 2],
        [1, 2]],

       [[5, 3],
        [1, 6],
        [3, 2]]])
>>> out_df
    a  b  c
0   0  2  0
1   0  1  7
2   0  3  8
3   1  5  0
4   1  0  7
5   1  8  0
6   2  2  5
7   2  8  2
8   2  1  2
9   3  5  3
10  3  1  6
11  3  3  2
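
If the length of the last dimension is not known in advance, the column names could be generated instead of hard-coded. The sketch below wraps the same NumPy steps in a helper; the function name array3d_to_frame, its parameters, and the col_0, col_1, ... default names are hypothetical choices made for this example, not something from the answer above.

```python
import numpy as np
import pandas as pd

def array3d_to_frame(a, index_name="a", value_names=None):
    """Flatten an (m, n, r) array into an (m*n, 1 + r) DataFrame."""
    m, n, r = a.shape
    if value_names is None:
        # Hypothetical default names: col_0, col_1, ...
        value_names = [f"col_{i}" for i in range(r)]
    # Same construction as above: repeated first-axis index plus reshaped values
    out_arr = np.column_stack((np.repeat(np.arange(m), n), a.reshape(m * n, r)))
    return pd.DataFrame(out_arr, columns=[index_name, *value_names])

# Check against the small example from the question
a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
expected = pd.DataFrame({'a': [0, 0, 1, 1], 'b': [1, 3, 5, 7], 'c': [2, 4, 6, 8]})
result = array3d_to_frame(a, value_names=["b", "c"])
print(np.array_equal(result.to_numpy(), expected.to_numpy()))
```

One thing to keep in mind: np.column_stack produces a single array with one common dtype, so if a holds floats the index column will come out as float as well.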

3 Comments

Thanks! This worked quite nicely. Although, I replaced m,n,r with x,y,z.
The best solution that I've found to pass 3d array to pandas dataFrame!!
Since the Panel Object was just removed in pandas v0.25.0 this should probably become the canonical answer.
5

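Using Panel (note that Panel was removed in pandas 0.25.0, so this only works on older versions):

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
# Move the last axis to the front so that it becomes the Panel's items (the frame's columns)
b = pd.Panel(np.rollaxis(a, 2)).to_frame()
# Keep only the first index level (major) and turn it into a regular column
c = b.set_index(b.index.labels[0]).reset_index()
c.columns = list('abc')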

Then a is:

[[[1 2]
  [3 4]]

 [[5 6]
  [7 8]]]

b is:

             0  1
major minor      
0     0      1  2
      1      3  4
1     0      5  6
      1      7  8

and c is:

   a  b  c
0  0  1  2
1  0  3  4
2  1  5  6
3  1  7  8
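
On pandas versions where Panel is no longer available, a similar intermediate frame b can probably be built directly with a MultiIndex. This is only a sketch of one possible substitute, not the answer's original method; the level names major and minor are chosen here to mirror the Panel output shown above.

```python
import numpy as np
import pandas as pd

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])
m, n, r = a.shape

# Every (major, minor) pair, in the same order as a C-order reshape of a
index = pd.MultiIndex.from_product([range(m), range(n)], names=["major", "minor"])
b = pd.DataFrame(a.reshape(m * n, r), index=index)

# Drop the minor level, then turn the remaining major level into a column
c = b.reset_index(level="minor", drop=True).reset_index()
c.columns = list("abc")
print(c)
```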

1 Comment

Panel has been deprecated, see answer below by @Divakar
1

Here's a pure-Pandas solution without Panels.

To get a dataframe with MultiIndex, use pd.concat:

>>> df = pd.concat([pd.DataFrame(arr) for arr in a], keys=np.arange(len(a)))
>>> df
     0  1
0 0  0  6
  1  6  4
  2  3  4
1 0  5  1
  1  1  3
  2  6  4
2 0  8  0
  1  2  3
  2  3  1
3 0  2  2
  1  0  0
  2  6  3

To convert it to the non-MultiIndex form provided in the question:

>>> df.reset_index().drop('level_1',axis=1).set_axis(['a','b','c'], axis=1)

    a  b  c
0   0  0  6
1   0  6  4
2   0  3  4
3   1  5  1
4   1  1  3
5   1  6  4
6   2  8  0
7   2  2  3
8   2  3  1
9   3  2  2
10  3  0  0
11  3  6  3
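
The conversion step above relies on the auto-generated column name level_1. A possible variant, sketched here on the small array from the question, names the value columns up front and drops the inner index level by position instead; it assumes a pandas version that provides DataFrame.droplevel.

```python
import numpy as np
import pandas as pd

a = np.array([[[1, 2], [3, 4]], [[5, 6], [7, 8]]])

# One small frame per 2D block, keyed by its position along the first axis
df = pd.concat(
    [pd.DataFrame(arr, columns=["b", "c"]) for arr in a],
    keys=np.arange(len(a)),
)

# Drop the inner (row-within-block) level, name the outer level 'a',
# and move it into a regular column
out = df.droplevel(1).rename_axis("a").reset_index()
print(out)
```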
