Convert DataFrame to JSON Arrays with Python

Question

I have this DataFrame:

             A       B        C        D
16         USA    15.5     10.8      4.7
15     Germany    17.7     12.3      5.3

I would like to create a JSON file that looks like so:

"data": [
   ["A", "B", "C", "D"],
   ["USA", "15.5", "10.8", "4.7"],
   ["Germany", "17.7", "12.3", "5.3"]
]

Neither to_dict() nor to_json() seems to produce this layout.

This does work:

import numpy

numpy_array = df.to_numpy()
column_headers = df.columns.values.tolist()
# Stack the header row on top of the data rows
array_with_headers = numpy.vstack([column_headers, numpy_array])
json_object = {}
json_object["data"] = array_with_headers.tolist()

...but it is kind of complicated. Any idea how I can achieve this without numpy? Thanks a lot!

ThePyGuy · Accepted Answer · 2021-03-29 17:01:49Z

You can convert each row to a list using apply and then call to_list(). That gives you the data as a list of lists, which you can pass to json.dumps:

import json
json.dumps({'data': df.reset_index(drop=True).T.reset_index().T.apply(list,axis=1).to_list()})

Output:

'{"data": [["A", "B", "C", "D"], ["USA", 15.5, 10.8, 4.7], ["Germany", 17.7, 12.3, 5.3]]}'

PS: reset_index() and the transpose (T) are used only to bring the column names into the first row.
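
To make the chain easier to follow, here is a minimal sketch that unpacks it step by step. It assumes the DataFrame from the question is rebuilt by hand; the intermediate names (step1, step2, step3, rows_chain, rows_simple) are made up for illustration. It also shows a shorter variant that should give the same rows without any transposing, by putting the column names in front of the row lists.

```python
import json
import pandas as pd

# Rebuild the DataFrame from the question (row labels 16 and 15).
df = pd.DataFrame(
    {"A": ["USA", "Germany"], "B": [15.5, 17.7], "C": [10.8, 12.3], "D": [4.7, 5.3]},
    index=[16, 15],
)

# The one-liner from the answer, unpacked (names are illustrative only):
step1 = df.reset_index(drop=True)  # drop the 16/15 row labels
step2 = step1.T.reset_index()      # transpose; column names become a regular column
step3 = step2.T                    # transpose back; column names are now the first row
rows_chain = step3.apply(list, axis=1).to_list()

# Shorter variant: header list followed by the row lists.
rows_simple = [df.columns.tolist()] + df.values.tolist()

result = json.dumps({"data": rows_simple})
print(result)
```

Both rows_chain and rows_simple hold the header row followed by one list per DataFrame row, so either can go into json.dumps. The numbers stay numbers; if they are needed as strings, as in the question, something like df.astype(str) before building the lists would be one way to get there.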

2 Comments

trashjazz Over a year ago

Awesome - thank you! Can you easily prettify the JSON with json.dumps (i.e. remove backslashes, line breaks with indent=1 instead of \n)?

ThePyGuy Over a year ago

That becomes a new question, you can ask on SO, someone will definitely answer it
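
On the prettifying follow-up, a minimal sketch, assuming the goal is readable output on screen or in a file. json.dumps and json.dump both accept an indent argument. The backslashes and \n in the question probably come from looking at the string's repr instead of printing it. The payload below just repeats the answer's output, and the file name data.json is made up.

```python
import json
import os
import tempfile

# Same structure as the answer's output.
payload = {
    "data": [
        ["A", "B", "C", "D"],
        ["USA", 15.5, 10.8, 4.7],
        ["Germany", 17.7, 12.3, 5.3],
    ]
}

# indent=1 puts each element on its own line, indented by one space per level.
pretty = json.dumps(payload, indent=1)
print(pretty)  # print() shows real line breaks; the bare repr shows \n escapes

# Write straight to a file with json.dump (hypothetical file name, temp directory).
path = os.path.join(tempfile.gettempdir(), "data.json")
with open(path, "w") as fh:
    json.dump(payload, fh, indent=1)
```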
