I want to produce a 3-dimensional numpy.ndarray from a multi-indexed pandas.DataFrame. More precisely, say I have:

import pandas as pd

df = pd.DataFrame([[1, '1', 1, 10], [1, '2', 2, 20], [2, '1', 5, 30]], columns=['x', 'y', 'z', 't'])
df = df.set_index(['x', 'y'])
df

which gives me

        z  t
  x  y
  1  1  1 10
     2  2 20
  2  1  5 30

and I want to write a function that, given the DataFrame above, returns the numpy.ndarray

  [[[1, 10],
    [2, 20]],
   [[5, 30],
    [NaN, NaN]]]

A pandas MultiIndex looks like a substitute for multidimensional arrays, but pandas does not provide (or at least does not document) a way to go back and forth between the two...

Thanks.

  • df.unstack().stack(dropna=False).to_numpy() Commented Jan 27, 2020 at 17:31
  • Thanks, but this does not work: this returns a 2-dimensional array. Commented Jan 27, 2020 at 17:58

1 Answer

Use:

df.to_xarray().to_array().values.transpose(1,2,0)

[[[ 1. 10.]
  [ 2. 20.]]

 [[ 5. 30.]
  [nan nan]]]
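
If you would rather not add xarray as a dependency, the same array can probably be built with pandas and NumPy alone. The following is only a sketch of a different technique (reindex onto the full product of the index levels, then reshape), not what the one-liner above does internally. The function name multiindex_to_ndarray is made up for illustration, and the sketch assumes the index has no duplicate entries and that all columns are numeric.

```python
import numpy as np
import pandas as pd


def multiindex_to_ndarray(df):
    """Hypothetical helper: array of shape (len(level_0), len(level_1), ..., n_columns)."""
    # Sorted unique labels of every index level.
    labels = [
        df.index.get_level_values(i).unique().sort_values()
        for i in range(df.index.nlevels)
    ]
    # Every combination of labels; combinations absent from df become NaN rows.
    full_index = pd.MultiIndex.from_product(labels, names=df.index.names)
    dense = df.reindex(full_index)
    # One axis per index level, plus a trailing axis for the columns.
    shape = [len(level) for level in labels] + [df.shape[1]]
    return dense.to_numpy(dtype=float).reshape(shape)


df = pd.DataFrame(
    [[1, '1', 1, 10], [1, '2', 2, 20], [2, '1', 5, 30]],
    columns=['x', 'y', 'z', 't'],
).set_index(['x', 'y'])

result = multiindex_to_ndarray(df)
print(result)
```

The result is a float array because the missing (2, '2') entry has to be represented as NaN.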

3 Comments

Thanks! So xarray is yet another library on top of numpy and pandas, which has to patch the deficiencies of pandas? PS: I'll upvote your answer when I have enough reputation.
Yes, it's a different data structure. More info here.
Note that the transpose is just to get the values in the order you specified.
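
To make the transpose comment concrete, here is a small NumPy-only sketch. It assumes, as the answer's output suggests, that to_array() stacks the columns along a new leading axis, so the raw values are ordered (column, x, y). The arrays z_grid and t_grid are hand-written stand-ins for that intermediate result, not something xarray exposes under those names.

```python
import numpy as np

# Hypothetical stand-ins: one 2-D (x, y) grid per DataFrame column.
z_grid = np.array([[1.0, 2.0], [5.0, np.nan]])
t_grid = np.array([[10.0, 20.0], [30.0, np.nan]])

# Stack along a new leading axis: axes are (column, x, y).
stacked = np.stack([z_grid, t_grid])

# transpose(1, 2, 0) reorders the axes to (x, y, column).
reordered = stacked.transpose(1, 2, 0)

# np.moveaxis states the same move explicitly: axis 0 goes to the end.
same = np.moveaxis(stacked, 0, -1)

print(reordered)
```

With more than two index levels, moveaxis(…, 0, -1) keeps working without spelling out every axis number.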
