1

I have the following json format that I need to convert to CSV

[{
            "name": "joe",
            "age": 21,
            "skills": [{
                "lang": "spanish",
                "grade": "47",
                "school": {
                    "name": "my school",
                    "url": "example.com/sp-school"
                }
            }, {
                "lang": "english",
                "grade": "87"
            }]

        },

        {
            "name": "sarah",
            "age": 34,
            "skills": [{
                "lang": "french",
                "grade": "47",
                "school": {
                    "name": "my school",
                    "url": "example.com/sp-school"
                }
            }, {
                "lang": "english",
                "grade": "87"
            }]

        }, {
            "name": "jim",
            "age": 26,
            "skills": [{
                "lang": "spanish",
                "grade": "60"

            }, {
                "lang": "english",
                "grade": "66",
                "school": {
                    "name": "eg school",
                    "url": "eg-school.com"

                }
            }]

        }
    ]

to convert to csv

name,age,grade,school,url,file,line_number
joe,21,47,"my school","example.com/sp-school",sample.json,1
jim,26,60,"","",sample.json,3

So add the top level fields and the object from the skills array if lang=spanish and the school hash from the skills object for spanish if it exists

I'd also like to add the file and line number it came from.

I would like to use jq for the job, but can't figure out the syntax , anyone help me out ?

1 Answer 1

2

With your data in input.json, and the following jq program in tocsv.jq:

.[]
| [.name, .age] +
  (.skills[]
  | select(.lang == "spanish")
  | [.grade, .school.name, .school.url, input_filename, input_line_number] )
| @csv

the invocation:

jq -r -f tocsv.jq  input.json

yields:

"joe",21,"47","my school","example.com/sp-school","input.json",51
"jim",26,"60",,,"input.json",51

If you want the number-valued strings converted to numbers, you could use the "tonumber" filter. If you want the null-valued fields replaced by strings, use e.g. .school.name // ""

Of course this approach doesn't yield a very useful line number. One approach that would yield higher granularity would be to stream the individual objects into jq, but then you'd lose the filename. To recover the filename you could pass it in as an argument. So you would have a pipeline like so:

jq -c '.[]' input.json | jq -r --arg file input.json -f tocsv2.jq

where tocsv2.jq would be like tscsv.jq above but without the initial .[] |, and with $file instead of input_filename.

Finally, please also consider using the TSV format (@tsv) rather than the rather messy CSV format (@csv).

Sign up to request clarification or add additional context in comments.

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.