Pandas DataFrame to List of Lists

Question

It's easy to turn a list of lists into a pandas dataframe:

import pandas as pd
df = pd.DataFrame([[1,2,3],[3,4,5]])

But how do I turn df back into a list of lists?

lol = df.what_to_do_now?
print(lol)
# [[1,2,3],[3,4,5]]

pd.DataFrame.what_to_do_now = lambda self: self.values.tolist(); lol = df.what_to_do_now(); print(lol) # [[1,2,3],[3,4,5]] It works, if you can believe it.
– cs95, Commented Dec 27, 2020 at 2:17

DSM · Accepted Answer · 2015-01-18 03:18:39Z

298

You could access the underlying array and call its tolist method:

>>> df = pd.DataFrame([[1,2,3],[3,4,5]])
>>> lol = df.values.tolist()
>>> lol
[[1L, 2L, 3L], [3L, 4L, 5L]]
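On Python 3 the L suffixes no longer appear, and on pandas 0.24 or later the same conversion can also go through to_numpy(). A minimal sketch, assuming a reasonably recent pandas:

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4, 5]])

# Both calls go through the underlying NumPy array.
# to_numpy() requires pandas 0.24 or later.
lol_values = df.values.tolist()
lol_numpy = df.to_numpy().tolist()

print(lol_numpy)
```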

answered Jan 18, 2015 at 3:18

DSM


8 Comments

user48956 Over a year ago

L means long, as opposed to int.

cs95 Over a year ago

From v0.24 onwards, it would be better to use df.to_numpy().tolist().

Russell Lego Over a year ago

NOTE: this does not preserve the column ordering, so watch out for that.

Yohan Obadia Over a year ago

There is no reason why it would not preserve the column ordering.

AMC Over a year ago

@RussellLego That seems a bit odd, do you happen to know of an example which could demonstrate that?
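One way to check the ordering claim is to compare each row against a label-based lookup. This is only a sketch with made-up column names (zeta, alpha, mid are assumptions, chosen to be out of alphabetical order), run against a recent pandas:

```python
import pandas as pd

# Hypothetical column names, deliberately not in alphabetical order.
df = pd.DataFrame({"zeta": [1, 2], "alpha": [3, 4], "mid": [5, 6]})

rows = df.values.tolist()

# Look up the first row value by value, following df.columns.
first_row_by_label = [df.loc[0, name] for name in df.columns]

print(list(df.columns), rows)
```

If the two agree, the list of lists follows the order of df.columns.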


Andrew E · Accepted Answer · 2017-11-14 05:24:10Z

29

If the data has column and index labels that you want to preserve, there are a few options.

Example data:

>>> df = pd.DataFrame([[1,2,3],[3,4,5]],
...                   columns=('first', 'second', 'third'),
...                   index=('alpha', 'beta'))
>>> df
       first  second  third
alpha      1       2      3
beta       3       4      5

The tolist() method described in other answers is useful but yields only the core data - which may not be enough, depending on your needs.

>>> df.values.tolist()
[[1, 2, 3], [3, 4, 5]]

One approach is to convert the DataFrame to JSON using df.to_json() and then parse it again. This is cumbersome but does have some advantages, because the to_json() method has some useful options. (The output below is shown pretty-printed; to_json() returns a single string.)

>>> df.to_json()
{
  "first":{"alpha":1,"beta":3},
  "second":{"alpha":2,"beta":4},
  "third":{"alpha":3,"beta":5}
}

>>> df.to_json(orient='split')
{
 "columns":["first","second","third"],
 "index":["alpha","beta"],
 "data":[[1,2,3],[3,4,5]]
}

Cumbersome but may be useful.
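The "parse it again" step can be done with the standard json module. A small sketch using the example data from this answer (the variable name parsed is my own):

```python
import json

import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [3, 4, 5]],
    columns=["first", "second", "third"],
    index=["alpha", "beta"],
)

# to_json() returns a string; json.loads turns it back into plain Python objects.
parsed = json.loads(df.to_json(orient="split"))

columns = parsed["columns"]
index = parsed["index"]
data = parsed["data"]

print(columns, index, data)
```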

The good news is that it's pretty straightforward to build lists for the columns and rows:

>>> columns = [df.index.name] + [i for i in df.columns]
>>> rows = [[i for i in row] for row in df.itertuples()]

This yields:

>>> print(f"columns: {columns}\nrows: {rows}") 
columns: [None, 'first', 'second', 'third']
rows: [['alpha', 1, 2, 3], ['beta', 3, 4, 5]]

If the None as the name of the index is bothersome, rename it:

df = df.rename_axis('stage')

Then:

>>> columns = [df.index.name] + [i for i in df.columns]
>>> print(f"columns: {columns}\nrows: {rows}") 

columns: ['stage', 'first', 'second', 'third']
rows: [['alpha', 1, 2, 3], ['beta', 3, 4, 5]]

answered Nov 14, 2017 at 5:24

Andrew E


4 Comments

Konstantin Over a year ago

If you have a multilevel index, the index tuple will be the first element of the generated rows. You'll need a further step to split it.
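A possible version of that further step, sketched with a made-up two-level index (the level names outer and inner and the columns x and y are assumptions for illustration):

```python
import pandas as pd

# Hypothetical two-level index.
idx = pd.MultiIndex.from_tuples([("a", 1), ("a", 2)], names=["outer", "inner"])
df = pd.DataFrame([[1, 2], [3, 4]], index=idx, columns=["x", "y"])

# row[0] is the index tuple; flatten it in front of the remaining values.
rows = [list(row[0]) + list(row[1:]) for row in df.itertuples()]
columns = list(df.index.names) + list(df.columns)

print(columns, rows)
```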

AMC Over a year ago

Wouldn't it be simpler to use DataFrame.itertuples() or DataFrame.to_records() for all this?

Andrew E Over a year ago

@AMC Perhaps, I don't know, maybe? Rather than pontificate, why not add a proper treatment of that thought in your own answer?

AMC Over a year ago

@AndrewE Eh, it's still worth discussing and improving upon existing answers.

neves · Accepted Answer · 2020-06-30 13:34:14Z

10

I wanted to preserve the index, so I adapted the original answer to this solution:

list_df = df.reset_index().values.tolist()

Now you can paste it somewhere else (e.g. into a Stack Overflow question) and later recreate it:

df = pd.DataFrame(list_df, columns=['name1', ...])
df.set_index(['name1'], inplace=True)
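For a concrete round trip, here is a sketch that reuses the example frame from an earlier answer; the index name stage and the column names are assumptions, not part of this answer:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [3, 4, 5]],
    columns=["first", "second", "third"],
    index=["alpha", "beta"],
).rename_axis("stage")

# The index becomes the first element of each row.
list_df = df.reset_index().values.tolist()

# Rebuild the frame and move the first column back into the index.
restored = pd.DataFrame(
    list_df, columns=["stage", "first", "second", "third"]
).set_index("stage")

print(list_df)
```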

edited Jun 30, 2020 at 13:34

answered Oct 3, 2018 at 14:31

neves


aps · Accepted Answer · 2015-01-18 03:59:46Z

7

I don't know if it will fit your needs, but you can also do:

>>> lol = df.values
>>> lol
array([[1, 2, 3],
       [3, 4, 5]])

This is just a NumPy array (a numpy.ndarray), which lets you do all the usual NumPy array things.
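As a rough illustration of what "the usual things" might look like without ever building a list of lists (the variable names here are my own):

```python
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4, 5]])
arr = df.to_numpy()

# Column-wise sum, row indexing and slicing work directly on the array.
col_sums = arr.sum(axis=0)
first_row = arr[0]
last_two_columns = arr[:, 1:]

print(col_sums, first_row, last_two_columns.shape)
```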

answered Jan 18, 2015 at 3:59

aps


1 Comment

jpp Over a year ago

Plus 1. In practice, there's often no need to convert the NumPy array into a list of lists.

Jondalar · Accepted Answer · 2021-04-17 14:58:30Z

6

I had this problem: how do I get the headers of the df into row 0 of the list, so that they are written to row 1 of the Excel sheet (using xlsxwriter)? None of the proposed solutions worked, but they pointed me in the right direction. I just needed one more line of code:

# get csv data
df = pd.read_csv(filename)

# combine column headers and list of lists of values
lol = [df.columns.tolist()] + df.values.tolist()
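To try this without a file on disk, the CSV can be read from an in-memory string. A sketch with made-up CSV content (the columns name and qty are assumptions):

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file.
csv_text = "name,qty\nbolt,3\nnut,5\n"
df = pd.read_csv(io.StringIO(csv_text))

# Header row first, then the data rows.
lol = [df.columns.tolist()] + df.values.tolist()

print(lol)
```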

answered Apr 17, 2021 at 14:58

Jondalar


Zoe - Save the data dump · Accepted Answer · 2019-01-15 15:30:43Z

2

Maybe something changed but this gave back a list of ndarrays which did what I needed.

list(df.values)
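The difference from tolist() is the type of the inner elements. A quick sketch to see it:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4, 5]])

rows_as_arrays = list(df.values)      # outer list, inner ndarrays
rows_as_lists = df.values.tolist()    # plain Python lists all the way down

print(type(rows_as_arrays[0]), type(rows_as_lists[0]))
```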

edited Jan 15, 2019 at 15:30

Zoe - Save the data dump


answered Jan 15, 2019 at 7:48

Ian Rubenstein


Thorsten · Accepted Answer · 2020-12-10 18:18:00Z

2

The solutions presented so far suffer from a "reinventing the wheel" approach. Quoting @AMC:

If you're new to the library, consider double-checking whether the functionality you need is already offered by those Pandas objects.

If you convert a dataframe to a list of lists you will lose information, namely the index and column names.

My solution: use to_dict()

dict_of_lists = df.to_dict(orient='split')

This will give you a dictionary with three lists: index, columns, data. If you decide you really don't need the columns and index names, you get the data with

dict_of_lists['data']
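For example, with the labelled frame used in an earlier answer (the data here is only for illustration):

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [3, 4, 5]],
    columns=["first", "second", "third"],
    index=["alpha", "beta"],
)

dict_of_lists = df.to_dict(orient="split")

# Three keys: 'index', 'columns' and 'data'.
print(sorted(dict_of_lists))
print(dict_of_lists["data"])
```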

answered Dec 10, 2020 at 18:18

Thorsten


1 Comment

Thorsten Over a year ago

The solution presented above is still "lossy". You will lose the name of the index and columns (df.index.name and df.columns.name)
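If the axis names matter, newer pandas versions (1.4 and later, as far as I know) offer orient='tight', which also records them. A sketch, assuming such a version is installed and using a made-up index name:

```python
import pandas as pd

df = pd.DataFrame(
    [[1, 2, 3], [3, 4, 5]],
    columns=["first", "second", "third"],
    index=["alpha", "beta"],
).rename_axis("stage")  # hypothetical index name

# 'tight' adds 'index_names' and 'column_names' to the 'split' layout.
tight = df.to_dict(orient="tight")

print(tight["index_names"], tight["column_names"])
```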

RAMA KRISHNA · Accepted Answer · 2021-06-18 21:30:04Z

2

Not quite related to the question, but another flavor of the same task:

Converting a DataFrame column (a Series) into a list of lists to plot a chart with create_distplot in Plotly:

    hist_data=[]
    hist_data.append(map_data['Population'].to_numpy().tolist())
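A self-contained version of the same idea, leaving out the Plotly call; the contents of map_data below are invented stand-ins:

```python
import pandas as pd

# Hypothetical stand-in for the map_data frame used above.
map_data = pd.DataFrame({"Population": [1200, 3400, 560]})

# One inner list per series to plot.
hist_data = [map_data["Population"].to_numpy().tolist()]

print(hist_data)
```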

answered Jun 18, 2021 at 21:30

RAMA KRISHNA


e1i45 · Accepted Answer · 2020-08-19 07:56:59Z

1

df.values returns a NumPy array, which does not preserve the per-column data types: with mixed columns, an integer might be converted to a float.

df.iterrows() yields (index, Series) pairs, and each row Series likewise does not guarantee to preserve the data types. See: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html

The code below converts to a list of lists and preserves the data types (the index is the first element of each row; pass index=False to itertuples() to omit it):

rows = [list(row) for row in df.itertuples()]
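A small sketch of the difference, using a made-up frame with one integer and one float column:

```python
import pandas as pd

# Hypothetical mixed-dtype frame: "n" is integer, "x" is float.
df = pd.DataFrame({"n": [1, 2], "x": [0.5, 1.5]})

# The shared NumPy array is float64, so the integers are upcast.
via_values = df.values.tolist()

# itertuples() reads each column separately, so the integers stay integers.
via_tuples = [list(row) for row in df.itertuples(index=False)]

print(via_values)
print(via_tuples)
```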

answered Aug 19, 2020 at 7:56

e1i45


Timothy C. Quinn · Accepted Answer · 2020-12-04 19:57:30Z

1

If you wish to convert a Pandas DataFrame to a table (list of lists) and include the header row, this should work:

import pandas as pd
def dfToTable(df:pd.DataFrame) -> list:
    return [list(df.columns)] + df.values.tolist()

Usage (in REPL):

>>> df = pd.DataFrame(
             [["r1c1","r1c2","r1c3"],["r2c1","r2c2","r3c3"]]
             , columns=["c1", "c2", "c3"])
>>> df
     c1    c2    c3
0  r1c1  r1c2  r1c3
1  r2c1  r2c2  r3c3
>>> dfToTable(df)
[['c1', 'c2', 'c3'], ['r1c1', 'r1c2', 'r1c3'], ['r2c1', 'r2c2', 'r3c3']]

answered Dec 4, 2020 at 19:57

Timothy C. Quinn


Tms91 · Accepted Answer · 2020-01-23 10:15:26Z

0

This is very simple:

import numpy as np

list_of_lists = np.array(df)

answered Jan 23, 2020 at 10:15

Tms91


1 Comment

AMC Over a year ago

How is this different from using DataFrame.values or DataFrame.to_numpy() ? Never mind the fact that it creates a NumPy array, not a plain Python list.
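To make the distinction concrete: np.array(df) gives an ndarray, and a plain list of lists still needs a tolist() call. A quick sketch:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([[1, 2, 3], [3, 4, 5]])

arr = np.array(df)            # a NumPy array, not a list
list_of_lists = arr.tolist()  # a plain Python list of lists

print(type(arr).__name__, list_of_lists)
```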

xjcl · Accepted Answer · 2023-01-02 14:49:08Z

0

A function I wrote that allows including the index column or the header row:

def df_to_list_of_lists(df, index=False, header=False):
    rows = []
    if header:
        rows.append(([df.index.name] if index else []) + [e for e in df.columns])
    for row in df.itertuples():
        rows.append([e for e in row] if index else [e for e in row][1:])
    return rows
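A usage sketch; the function is repeated so the snippet runs on its own, and the example frame is borrowed from an earlier answer:

```python
import pandas as pd


def df_to_list_of_lists(df, index=False, header=False):
    rows = []
    if header:
        rows.append(([df.index.name] if index else []) + [e for e in df.columns])
    for row in df.itertuples():
        # itertuples() puts the index first; drop it unless requested.
        rows.append([e for e in row] if index else [e for e in row][1:])
    return rows


df = pd.DataFrame(
    [[1, 2, 3], [3, 4, 5]],
    columns=["first", "second", "third"],
    index=["alpha", "beta"],
)

plain = df_to_list_of_lists(df)
with_header = df_to_list_of_lists(df, header=True)
full = df_to_list_of_lists(df, index=True, header=True)

print(full)
```

Note that an unnamed index shows up as None in the header row.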

answered Jan 2, 2023 at 14:49

xjcl


Ram Prajapati · Accepted Answer · 2020-01-09 10:57:01Z

-1

We can use the DataFrame.iterrows() function to iterate over each of the rows of the given Dataframe and construct a list out of the data of each row:

# Empty list
row_list = []

# Iterate over each row
for index, rows in df.iterrows():
    # Create list for the current row
    my_list = [rows.Date, rows.Event, rows.Cost]

    # Append the list to the final list
    row_list.append(my_list)

# Print
print(row_list)

This extracts each row of the given DataFrame into a list.

answered Jan 9, 2020 at 10:57

Ram Prajapati


1 Comment

Derek O Over a year ago

This is not a good idea. Try to avoid df.iterrows, because it's an anti-pattern and slow once the df gets large: stackoverflow.com/questions/16476924/…
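One loop-free alternative is to select the columns and convert in one step. A sketch; the data is invented, and only the column names Date, Event and Cost come from the answer:

```python
import pandas as pd

# Hypothetical data using the answer's column names.
df = pd.DataFrame(
    {
        "Date": ["2020-01-01", "2020-01-02"],
        "Event": ["Music", "Poetry"],
        "Cost": [10000, 5000],
    }
)

# Select the columns in the desired order, then convert without iterrows().
row_list = df[["Date", "Event", "Cost"]].values.tolist()

print(row_list)
```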

AMC · Accepted Answer · 2020-05-29 00:08:46Z

-1

Note: I have seen many cases on Stack Overflow where converting a Pandas Series or DataFrame to a NumPy array or plain Python lists is entirely unnecessary. If you're new to the library, consider double-checking whether the functionality you need is already offered by those Pandas objects.

To quote a comment by @jpp:

In practice, there's often no need to convert the NumPy array into a list of lists.

If a Pandas DataFrame/Series won't work, you can use the built-in DataFrame.to_numpy and Series.to_numpy methods.

edited May 29, 2020 at 0:08

answered Jan 7, 2020 at 17:40

AMC


3 Comments

Nicolas Gervais Over a year ago

This answer represents little more than your own beliefs. And quite frankly, it's a little embarrassing. There are perfectly valid reasons to convert a dataframe to a list/array, an advanced user would certainly know.

AMC Over a year ago

@NicolasGervais It might be a bit too much, yes, I'll edit it to generalize less. "There are perfectly valid reasons to convert a dataframe to a list/array": Of course, my answer doesn't really say anything to the contrary. "an advanced user would certainly know": I don't see the point of that jab. I wrote this answer after noticing that many people were converting series to ndarrays or lists, and ndarrays to lists, simply because they were unaware of what operations those objects support.

AMC Over a year ago

I'm referring to very blatant cases, like doing for elem in some_series.values.tolist(): because they don't know that you can iterate over the elements of a series. I'm not sure what's so awful about this answer.
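For reference, iterating a Series directly might look like this (some_series holds made-up values):

```python
import pandas as pd

some_series = pd.Series([1, 2, 3])

# No .values.tolist() needed: a Series is iterable over its elements.
total = 0
for elem in some_series:
    total += elem

print(total)
```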
