How to construct pandas dataframe from series of arrays

Question

I have the following pandas Series of NumPy arrays:

 datetime
    03-Sep-15     [53.5688348969, 31.2542494769, 18.002043765]
    04-Sep-15     [46.845084292, 27.0833015735, 15.5997887379]
    08-Sep-15    [52.8701581666, 30.7347431703, 17.6379377917]
    09-Sep-15    [47.9535624339, 27.7063099999, 15.9126963643]
    10-Sep-15     [51.2900606534, 29.600945626, 16.8756260105]

How can I convert it into a DataFrame with three columns? Thanks!

Let me get back to this: I just realized the Series has some NaNs that are treated as single rows. Don't spend any time on it yet.
– NickD1, Commented Sep 15, 2015 at 19:43

jpp · Accepted Answer · 2018-07-11 10:22:06Z

19

Feeding a list of lists to pd.DataFrame is a more efficient approach than s.apply(pd.Series):

import numpy as np
import pandas as pd

s = pd.Series([np.array([53.5688348969, 31.2542494769, 18.002043765]),
               np.array([46.845084292, 27.0833015735, 15.5997887379]),
               np.array([52.8701581666, 30.7347431703, 17.6379377917]),
               np.array([47.9535624339, 27.7063099999, 15.9126963643]),
               np.array([51.2900606534, 29.600945626, 16.8756260105])],
              index=['03-Sep-15', '04-Sep-15', '08-Sep-15', '09-Sep-15', '10-Sep-15'])

df = pd.DataFrame(s.values.tolist(), index=s.index)

print(df)

                   0          1          2
03-Sep-15  53.568835  31.254249  18.002044
04-Sep-15  46.845084  27.083302  15.599789
08-Sep-15  52.870158  30.734743  17.637938
09-Sep-15  47.953562  27.706310  15.912696
10-Sep-15  51.290061  29.600946  16.875626

Benchmarking on Python 3.6 / Pandas 0.19:

%timeit pd.DataFrame(s.values.tolist(), index=s.index)  # 448 µs per loop
%timeit s.apply(pd.Series)                              # 1.5 ms per loop
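
These timings depend on the pandas version and on the size of the Series, so it is worth measuring on your own data. Below is a minimal sketch using the standard-library timeit module outside IPython; the synthetic Series, the row count, and the helper names via_list and via_apply are assumptions for illustration, not part of the original benchmark.

```python
import timeit
import warnings

import numpy as np
import pandas as pd

# Synthetic data (assumed): 500 rows, each holding a length-3 float array.
rng = np.random.default_rng(0)
s = pd.Series([rng.random(3) for _ in range(500)])


def via_list():
    # Build the frame from a plain Python list of rows.
    return pd.DataFrame(s.values.tolist(), index=s.index)


def via_apply():
    # Build one Series per row; silence any deprecation warning
    # that some pandas versions emit for this pattern.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", FutureWarning)
        return s.apply(pd.Series)


# Both approaches should produce the same values.
same_values = np.allclose(via_list().to_numpy(), via_apply().to_numpy())

# Total seconds for 5 calls of each approach.
t_list = timeit.timeit(via_list, number=5)
t_apply = timeit.timeit(via_apply, number=5)
print(f"list of lists: {t_list:.4f} s, apply(pd.Series): {t_apply:.4f} s")
```

The printed numbers will vary by machine and library version, so treat them as a local comparison only.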


1 Comment

mrduhart Over a year ago

Not in my case at least: 2.88 s ± 40.9 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

<magic-timeit>:1: FutureWarning: Returning a DataFrame from Series.apply when the supplied function returns a Series is deprecated and will be removed in a future version.

851 ms ± 5.45 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)

DSM · Accepted Answer · 2015-09-15 19:36:50Z

It won't be super-performant, but you should be able to apply(pd.Series):

>>> ser
03-Sep-15     [53.5688348969, 31.2542494769, 18.002043765]
04-Sep-15     [46.845084292, 27.0833015735, 15.5997887379]
08-Sep-15    [52.8701581666, 30.7347431703, 17.6379377917]
09-Sep-15    [47.9535624339, 27.7063099999, 15.9126963643]
10-Sep-15     [51.2900606534, 29.600945626, 16.8756260105]
dtype: object
>>> type(ser.values[0])
<class 'numpy.ndarray'>
>>> ser.apply(pd.Series)
                   0          1          2
03-Sep-15  53.568835  31.254249  18.002044
04-Sep-15  46.845084  27.083302  15.599789
08-Sep-15  52.870158  30.734743  17.637938
09-Sep-15  47.953562  27.706310  15.912696
10-Sep-15  51.290061  29.600946  16.875626
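
The asker's comment mentions entries that are a single NaN rather than an array. One possible way to handle that case, sketched here under the assumption that every real array has the same known length (the variable width and the sample data are made up for illustration), is to replace each scalar NaN with a full row of NaNs before building the frame:

```python
import numpy as np
import pandas as pd

# Sample data (assumed): the middle entry is a scalar NaN, not an array.
ser = pd.Series(
    [np.array([1.0, 2.0, 3.0]), np.nan, np.array([4.0, 5.0, 6.0])],
    index=["03-Sep-15", "04-Sep-15", "08-Sep-15"],
)

width = 3  # assumed: the common length of the real arrays

# Keep real arrays as they are; expand anything else into a row of NaNs.
rows = [
    np.asarray(v, dtype=float) if isinstance(v, np.ndarray) else np.full(width, np.nan)
    for v in ser
]

df = pd.DataFrame(np.vstack(rows), index=ser.index)
print(df)
```

The missing date keeps its place in the index, and its row can be dropped afterwards with df.dropna(how="all") if it is not wanted.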

Bill · Accepted Answer · 2024-06-25 16:56:44Z

0

You can also do this:

df = pd.DataFrame(np.stack(s.tolist()), index=s.index)

I believe it's a little faster than pd.DataFrame(s.values.tolist(), index=s.index).
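
One difference to keep in mind, shown in the sketch below with made-up data: np.stack requires every array to have the same shape and raises ValueError otherwise, while pd.DataFrame built from a list of lists pads shorter rows with NaN. The variable names here are illustrative only.

```python
import numpy as np
import pandas as pd

# Equal-length arrays: both constructions give the same values.
s = pd.Series(
    [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0, 6.0])],
    index=["a", "b"],
)
stacked = pd.DataFrame(np.stack(s.tolist()), index=s.index)
listed = pd.DataFrame(s.values.tolist(), index=s.index)

# Ragged arrays: np.stack refuses them.
ragged = pd.Series(
    [np.array([1.0, 2.0, 3.0]), np.array([4.0, 5.0])],
    index=["a", "b"],
)
try:
    np.stack(ragged.tolist())
    stack_failed = False
except ValueError:
    stack_failed = True

# A list of plain Python lists is padded with NaN instead.
padded = pd.DataFrame([a.tolist() for a in ragged], index=ragged.index)
print(padded)
```

So np.stack doubles as a check that the rows are uniform, whereas the list-based constructor is more forgiving.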

