0

I have a csv file in the following format:

a b c d e
1 2 3 4 5
9 8 7 6 5

I want to convert this csv file to Nested JSON format, like this:

[{"a": 1,
"Purchase" : {
              "b": 2,
              "c": 3,
              "d": 4},
"Sales": {
           "d": 4,
           "e": 5}},
{"a": 9,
"Purchase" : {
              "b": 8,
              "c": 7},
"Sales": {
           "d": 6,
           "e": 5}}]

How can I do this transformation in Python? Keep in mind this is only a sample table; my real table has many columns and thousands of rows, so manual operations are not practical.

So far I have tried this code:

import csv

with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    for r in reader:
        r["purchase"] = {"b": r['b'],
                        "c": r['c'],
                        }

Here I am trying to add another key-value pair holding the nested dictionary I need, but it does not work. I would do the same for Sales; this is just a sample.

Thanks for sharing your question. Can you also share what you have tried so far? Commented Mar 23, 2022 at 8:05

3 Answers

2

A simple way is to build the nested columns first, then use the to_json method in pandas:

import pandas as pd
df = pd.read_csv('your_file.csv')
df['Purchase'] = df[['b','c','d']].to_dict('records')
df['Sales'] = df[['d','e']].to_dict('records')
out = df[['a', 'Purchase', 'Sales']].to_json(orient='records', indent=4)

Output:

[
    {
        "a":1,
        "Purchase":{
            "b":2,
            "c":3,
            "d":4
        },
        "Sales":{
            "d":4,
            "e":5
        }
    },
    {
        "a":9,
        "Purchase":{
            "b":8,
            "c":7,
            "d":6
        },
        "Sales":{
            "d":6,
            "e":5
        }
    }
]

0

You don't need any third-party libraries for this; the standard csv and json modules are enough. Just specify the right dialect, e.g. for a tab-separated file:

import csv
import json


with open("tmp4.csv", "r") as f:
    result = [
        {
            "a": row["a"],
            "Purchase": {
                "b": row["b"],
                "c": row["c"],
            },
            "Sales": {
                "d": row["d"],
                "e": row["e"],
            },
        }
        for row in csv.DictReader(f, dialect='excel-tab')
    ]
assert (
    json.dumps(result)
    == '[{"a": "1", "Purchase": {"b": "2", "c": "3"}, "Sales": {"d": "4", "e": "5"}}, {"a": "9", "Purchase": {"b": "8", "c": "7"}, "Sales": {"d": "6", "e": "5"}}]'
)
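Note that csv.DictReader yields every value as a string, which is why the assertion above compares against "1" rather than 1. If the JSON should contain numbers, as in the question, the values can be converted while each record is built. A minimal sketch, assuming every column holds integers and using an in-memory tab-separated sample in place of tmp4.csv; the to_record helper is a hypothetical name, not part of the original answer:

```python
import csv
import io
import json

# In-memory stand-in for the tab-separated file
# (assumption: every value is an integer).
sample = "a\tb\tc\td\te\n1\t2\t3\t4\t5\n9\t8\t7\t6\t5\n"


def to_record(row):
    # DictReader yields strings, so convert each value before nesting it.
    values = {key: int(value) for key, value in row.items()}
    return {
        "a": values["a"],
        "Purchase": {"b": values["b"], "c": values["c"]},
        "Sales": {"d": values["d"], "e": values["e"]},
    }


reader = csv.DictReader(io.StringIO(sample), dialect="excel-tab")
result = [to_record(row) for row in reader]
print(json.dumps(result))
```

If some columns hold text or decimals, int() would raise ValueError, so the conversion would need to be applied per column instead.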


0

When you do r["purchase"] = {"b": ...}, you're assigning the dictionary to the per-row object r, which is discarded at the end of each iteration. Instead, create a new dictionary per record and append it to a list, like this:

import csv

result = []
with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    for r in reader:
        result.append({
            "a": r["a"],
            "Purchase" : {
                "b": r["b"],
                "c": r["c"],
                "d": r["d"],
            },
            "Sales": {
                "d": r["d"],
                "e": r["e"],
            },
        })

Or use a list comprehension to create result:

import csv

with open("new_data.csv") as f:
    reader = csv.DictReader(f)
    result = [{
        "a": r["a"],
        "Purchase" : {
            "b": r["b"],
            "c": r["c"],
            "d": r["d"],
        },
        "Sales": {
            "d": r["d"],
            "e": r["e"],
        },
    } for r in reader]
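Since the real table has many columns, writing every key by hand gets tedious. One possible generalization is to describe the nesting once in a mapping and build each record from it. This is only a sketch: the GROUPS mapping, the nest helper, and the comma-separated in-memory sample are illustrative assumptions, not part of the code above.

```python
import csv
import io
import json

# Hypothetical layout: group name -> columns that go into that nested object.
GROUPS = {
    "Purchase": ["b", "c", "d"],
    "Sales": ["d", "e"],
}


def nest(row, groups):
    """Keep ungrouped columns at the top level and nest the grouped ones."""
    grouped = {col for cols in groups.values() for col in cols}
    record = {key: value for key, value in row.items() if key not in grouped}
    for name, cols in groups.items():
        record[name] = {col: row[col] for col in cols}
    return record


# In-memory stand-in for new_data.csv (assumed to be comma-separated).
sample = "a,b,c,d,e\n1,2,3,4,5\n9,8,7,6,5\n"
result = [nest(row, GROUPS) for row in csv.DictReader(io.StringIO(sample))]
print(json.dumps(result, indent=4))
```

With this layout, adding a column to a group means editing only the mapping. The values are still strings here, because csv.DictReader does not convert types.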
