11

I have a 3d array as follows:

    ThreeD_Arrays = np.random.randint(0, 1000, (5, 4, 3))
    
    array([[[715, 226, 632],
            [305,  97, 534],
            [ 88, 592, 902],
            [172, 932, 263]],

           [[895, 837, 431],
            [649, 717,  39],
            [363, 121, 274],
            [334, 359, 816]],

           [[520, 692, 230],
            [452, 816, 887],
            [688, 509, 770],
            [290, 856, 584]],

           [[286, 358, 462],
            [831,  26, 332],
            [424, 178, 642],
            [955,  42, 938]], 

           [[ 44, 119, 757],
            [908, 937, 728],
            [809,  28, 442],
            [832, 220, 348]]])

Now I would like to have it into a DataFrame like this:

the expected results is shown here

Add a Date column like indicated and the column names A, B, C.

How to do this transformation? Thanks!

3 Answers 3

8

Based on the answer to this question, we can use a MultiIndex. First, create the MultiIndex and a flattened DataFrame.

A = np.random.randint(0, 1000, (5, 4, 3))

names = ['x', 'y', 'z']
index = pd.MultiIndex.from_product([range(s)for s in A.shape], names=names)
df = pd.DataFrame({'A': A.flatten()}, index=index)['A']

Now we can reshape it however we like:

df = df.unstack(level='x').swaplevel().sort_index()
df.columns = ['A', 'B', 'C']
df.index.names = ['DATE', 'i']

This is the result:

          A    B    C
DATE i           
0    0  715  226  632
     1  895  837  431
     2  520  692  230
     3  286  358  462
     4   44  119  757
1    0  305   97  534
     1  649  717   39
     2  452  816  887
     3  831   26  332
     4  908  937  728
2    0   88  592  902
     1  363  121  274
     2  688  509  770
     3  424  178  642
     4  809   28  442
3    0  172  932  263
     1  334  359  816
     2  290  856  584
     3  955   42  938
     4  832  220  348
Sign up to request clarification or add additional context in comments.

3 Comments

The first part of this answer is the nicest way I believe to convert a multidimensional Numpy array to a Pandas dataframe.
The pandas panel has been depreciated, so this should now be the accepted answer.
Why do you need pd.DataFrame({'A': A.flatten()}, index=index)['A'] instead of simply pd.DataFrame(A.flatten(), index=index)?
4

You could convert your 3D array to a Pandas Panel, then flatten it to a 2D DataFrame (using .to_frame()):

import numpy as np
import pandas as pd
np.random.seed(2016)

arr = np.random.randint(0, 1000, (5, 4, 3))
pan = pd.Panel(arr)
df = pan.swapaxes(0, 2).to_frame()
df.index = df.index.droplevel('minor')
df.index.name = 'Date'
df.index = df.index+1
df.columns = list('ABC')

yields

        A    B    C
Date               
1     875  702  266
1     940  180  971
1     254  649  353
1     824  677  745
...
4     675  488  939
4     382  238  225
4     923  926  633
4     664  639  616
4     770  274  378

Alternatively, you could reshape the array to shape (20, 3), form the DataFrame as usual, and then fix the index:

import numpy as np
import pandas as pd
np.random.seed(2016)

arr = np.random.randint(0, 1000, (5, 4, 3))
df = pd.DataFrame(arr.reshape(-1, 3), columns=list('ABC'))
df.index = np.repeat(np.arange(arr.shape[0]), arr.shape[1]) + 1
df.index.name = 'Date'
print(df)

yields the same result.

1 Comment

Many thanks! unutbu! it works! however I would like to plot the data in a seaborn like disgram, and need to have column name more complicated than 'ABC', say it should be "Abc", "Def", "Ghi" how it works? And the seaborn like disgram I like to have is impletemented as follows: df = pd.melt(df, id_vars=["date"], var_name="condition") ax = df.groupby(["condition", "date"]).mean().unstack("condition").plot() x = np.arange(1990,1993) palette = sns.color_palette() Sources: stackoverflow.com/questions/22795348/…
3
ThreeD_Arrays = np.random.randint(0, 1000, (5, 4, 3))
df = pd.DataFrame([list(l) for l in ThreeD_Arrays]).stack().apply(pd.Series).reset_index(1, drop=True)
df.index.name = 'Date'
df.columns = list('ABC')

1 Comment

Thanks @Jarad! my new question is how to make the resulted df from your code to fit the following code: df = pd.melt(df, id_vars=["date"], var_name="condition") ax = df.groupby(["condition", "date"]).mean().unstack("condition").plot() and how to replace list('ABC') with column name "AAA", "BBB", "CCC"?

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.