I have a dataframe that I would like to convert to JSON, placing selected columns at specific positions. Since it has many rows, I can't do this by hand.

I have a dataframe that looks like this:

import numpy as np
import pandas as pd

Cars = {'Brand': ['Honda Civic', 'Toyota Corolla', 'Ford Focus', 'Audi A4', np.nan, 'Ford', 'Audi A1'],
        'Price': [22000, 25000, 27000, 35000, 29000, 27000, 35000],
        'Liscence Plate': ['ABC 123', 'XYZ 789', 'CBA 321', 'ZYX 987', 'DEF 456', 'DEF 466', 'ABC 123']}

df = pd.DataFrame(Cars, columns=['Brand', 'Price', 'Liscence Plate'])


            Brand  Price  Liscence Plate
0     Honda Civic  22000         ABC 123
1  Toyota Corolla  25000         XYZ 789
2      Ford Focus  27000         CBA 321
3         Audi A4  35000         ZYX 987
4             NaN  29000         DEF 456
5            Ford  27000         DEF 466
6         Audi A1  35000         ABC 123

And I need to convert it to this:

data = {"form": [
    {"Liscence Plate": "ABC 123",
     "Brand": ["Honda Civic", "Audi A1"],
     "Price": ["22000", "35000"]},
    {"Liscence Plate": "XYZ 789",
     "Brand": ["Toyota Corolla"],
     "Price": ["25000"]},
    {"Liscence Plate": "CBA 321",
     "Brand": ["Ford Focus"],
     "Price": ["27000"]},
    {"Liscence Plate": "ZYX 987",
     "Brand": ["Audi A4"],
     "Price": ["35000"]},
    {"Liscence Plate": "DEF 456",
     "Brand": ["NaN", "Ford"],
     "Price": ["29000", "27000"]}
]}

3 Answers

Have a look at the .to_json() function. It will allow you to easily convert a DataFrame to json. You can change the schema of the json by supplying the orient argument.

This works well enough, but it will not give you lists for the Brand and Price keys. For more flexibility, first call .to_dict() with the same orient argument, make your changes on the resulting Python objects, and then convert to JSON with json.dumps().
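
The .to_dict() route described above could look roughly like the sketch below. The three-row frame is a shortened, illustrative version of the question's data, and the names records, form and result are made up for this example.

```python
import json

import numpy as np
import pandas as pd

# Shortened, illustrative sample in the shape used in the question.
df = pd.DataFrame({
    "Brand": ["Honda Civic", np.nan, "Audi A1"],
    "Price": [22000, 29000, 35000],
    "Liscence Plate": ["ABC 123", "DEF 456", "ABC 123"],
})

# orient="records" yields one plain dict per row.
records = df.to_dict(orient="records")

# Reshape each record by hand: plate stays a string,
# Brand and Price become one-element lists of strings.
form = [
    {
        "Liscence Plate": rec["Liscence Plate"],
        "Brand": [str(rec["Brand"])],
        "Price": [str(rec["Price"])],
    }
    for rec in records
]

# json.dumps returns a string; json.dump would write to a file object instead.
result = json.dumps({"form": form}, indent=2)
print(result)
```

Note that this keeps one entry per row; it does not merge rows that share a plate.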

Edit: Based on your edit, I think you want to group by the license plate first? In that case you can do:

df.groupby('Liscence Plate').agg(list).reset_index().to_json(orient='records')

to aggregate to lists and convert to json.
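
For reference, here is a self-contained sketch of that grouping step on the question's data. Passing sort=False is an assumption on my side, made so the plates keep their order of first appearance; the names grouped and records are illustrative only.

```python
import json

import numpy as np
import pandas as pd

cars = {
    "Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", np.nan, "Ford", "Audi A1"],
    "Price": [22000, 25000, 27000, 35000, 29000, 27000, 35000],
    "Liscence Plate": ["ABC 123", "XYZ 789", "CBA 321", "ZYX 987", "DEF 456", "DEF 466", "ABC 123"],
}
df = pd.DataFrame(cars)

# Collect Brand and Price into one list per plate.
# sort=False keeps the plates in order of first appearance.
grouped = df.groupby("Liscence Plate", sort=False).agg(list).reset_index()

# orient must be passed by keyword: the first positional argument
# of to_json is the output path, not the orientation.
json_text = grouped.to_json(orient="records")

# Parse the string back to inspect the structure.
records = json.loads(json_text)
print(json.dumps(records, indent=2))
```

Prices stay numeric here and the missing brand is written as null, so a further step is still needed if every value has to be a string.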

4 Comments

Hello, yes, I checked these functions, but they don't let me produce the format I want, e.g. a specific column in a specific place.
The export keeps the column order of your dataframe, so you could reorder the dataframe's columns before exporting to JSON. Or use the .to_dict() function and then a for loop to make the changes manually.
I updated my question and expected output
Check the updated answer
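
The column reordering suggested in the comments above might be sketched as follows. The two-row frame is illustrative, and reordered and json_text are names chosen for this example.

```python
import json

import pandas as pd

df = pd.DataFrame({
    "Brand": ["Honda Civic", "Toyota Corolla"],
    "Price": [22000, 25000],
    "Liscence Plate": ["ABC 123", "XYZ 789"],
})

# Selecting columns with a list returns them in exactly that order.
reordered = df[["Liscence Plate", "Brand", "Price"]]

# The keys of each exported record follow the column order.
json_text = reordered.to_json(orient="records")
print(json_text)
```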

You can use pandas.DataFrame.iterrows and build the result manually.

data = {'form' : [
            {k:[str(s[k])] if t == list else str(s[k])
                for k, t in (("Liscence Plate", str), ("Brand", list), ("Price", list))}
            for _, s in df.iterrows()]
       }
>>> data
{'form': [
    {'Brand': ['Honda Civic'], 'Liscence Plate': 'ABC 123', 'Price': ['22000']},
    {'Brand': ['Toyota Corolla'], 'Liscence Plate': 'XYZ 789', 'Price': ['25000']},
    {'Brand': ['Ford Focus'], 'Liscence Plate': 'CBA 321', 'Price': ['27000']},
    {'Brand': ['Audi A4'], 'Liscence Plate': 'ZYX 987', 'Price': ['35000']},
    {'Brand': ['nan'], 'Liscence Plate': 'DEF 456', 'Price': ['29000']}
    ]
}

Which is quite close to what you are looking for.

1 Comment

Then follow @Jan's suggestion: 1) group by Liscence Plate, 2) export as a dict and loop over the dict to reformat. After grouping you could also reformat inside the dataframe before exporting as a dict, which is fancier and quicker.
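
A possible sketch of the "reformat inside the dataframe" variant from this comment: group first, turn every list element into a string, then export as a dict. The use of sort=False and the names grouped and data are assumptions for illustration.

```python
import numpy as np
import pandas as pd

cars = {
    "Brand": ["Honda Civic", "Toyota Corolla", "Ford Focus", "Audi A4", np.nan, "Ford", "Audi A1"],
    "Price": [22000, 25000, 27000, 35000, 29000, 27000, 35000],
    "Liscence Plate": ["ABC 123", "XYZ 789", "CBA 321", "ZYX 987", "DEF 456", "DEF 466", "ABC 123"],
}
df = pd.DataFrame(cars)

# 1) Group by plate and collect the other columns into lists.
grouped = df.groupby("Liscence Plate", sort=False).agg(list).reset_index()

# 2) Reformat inside the frame: stringify every element of the list columns.
#    A missing brand becomes the string "nan".
for col in ("Brand", "Price"):
    grouped[col] = grouped[col].apply(lambda values: [str(v) for v in values])

# 3) Export as a list of dicts and wrap it under the "form" key.
data = {"form": grouped.to_dict(orient="records")}
print(data["form"][0])
```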

So you want this?

df.to_json(orient='records')

Outputs:

[{
    "Brand": "Honda Civic",
    "Price": 22000,
    "Liscence Plate": "ABC 123"
}, {
    "Brand": "Toyota Corolla",
    "Price": 25000,
    "Liscence Plate": "XYZ 789"
}, {
    "Brand": "Ford Focus",
    "Price": 27000,
    "Liscence Plate": "CBA 321"
}, {
    "Brand": "Audi A4",
    "Price": 35000,
    "Liscence Plate": "ZYX 987"
}, {
    "Brand": null,
    "Price": 29000,
    "Liscence Plate": "DEF 456"
}]

Edit:

df = df.groupby('Liscence Plate').agg({'Brand': list, 'Price': list}).reset_index()
df.to_json(orient='records')

[{
    "Liscence Plate": "ABC 123",
    "Brand": ["Honda Civic"],
    "Price": [22000]
}, {
    "Liscence Plate": "CBA 321",
    "Brand": ["Ford Focus"],
    "Price": [27000]
}, {
    "Liscence Plate": "DEF 456",
    "Brand": [null, "Ford F-150"],
    "Price": [29000, 33000]
}, {
    "Liscence Plate": "XYZ 789",
    "Brand": ["Toyota Corolla"],
    "Price": [25000]
}, {
    "Liscence Plate": "ZYX 987",
    "Brand": ["Audi A4"],
    "Price": [35000]
}]

Adding a custom key at the start:

'{\"form\": ' + df.to_json(orient='records') + '}'

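The \" escapes are optional here: inside a single-quoted Python string a plain " works as well. Either way, form ends up quoted in the output.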

{
    "form": [{
        "Liscence Plate": "ABC 123",
        "Brand": ["Honda Civic"],
        "Price": [22000]
    }, {
        "Liscence Plate": "CBA 321",
        "Brand": ["Ford Focus"],
        "Price": [27000]
    }, {
        "Liscence Plate": "DEF 456",
        "Brand": [null, "Ford F-150"],
        "Price": [29000, 33000]
    }, {
        "Liscence Plate": "XYZ 789",
        "Brand": ["Toyota Corolla"],
        "Price": [25000]
    }, {
        "Liscence Plate": "ZYX 987",
        "Brand": ["Audi A4"],
        "Price": [35000]
    }]
}
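
As an alternative to string concatenation, the wrapper key could also be added with the json module, which avoids hand-written braces and quotes. This is only a sketch; the already-grouped two-row frame and the names records and wrapped are illustrative.

```python
import json

import pandas as pd

# Illustrative frame that is already grouped (lists in Brand and Price).
df = pd.DataFrame({
    "Liscence Plate": ["ABC 123", "XYZ 789"],
    "Brand": [["Honda Civic"], ["Toyota Corolla"]],
    "Price": [[22000], [25000]],
})

# Parse the records back into Python objects ...
records = json.loads(df.to_json(orient="records"))

# ... and let json.dumps handle quoting and braces for the outer key.
wrapped = json.dumps({"form": records}, indent=4)
print(wrapped)
```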

1 Comment

@Mauris Since "form" is the key for the entire array, just add it on to the beginning. See edits.
