3

How can I flatten this json array and write out to a csv file using jq so that each row contains all the fields i.e.

I'm a bit of a novice with jq but started with this, which does not capture the inner nested array

jq -r '(.[0] | keys_unsorted) as $keys | $keys, map([.[ $keys[] ]])[]'

[
   {
      "name":"PM2.5",
      "unit":"ug/m3",
      "values":[
         {
            "timestamp":"2020-11-16T13:10:21.135Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:31.649Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:42.35Z",
            "value":3.696
         },
         {
            "timestamp":"2020-11-16T13:10:52.953Z",
            "value":3.696
         }
      ]
   },
   {
      "name":"PM10",
      "unit":"ug/m3",
      "values":[
         {
            "timestamp":"2020-11-16T13:10:21.135Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:31.649Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:42.35Z",
            "value":7.596
         },
         {
            "timestamp":"2020-11-16T13:10:52.953Z",
            "value":7.596
         }
      ]
   }
]

To a csv as follows:

"name", "unit", "timestamp", "value"
"PM2.5", "ug/m3", "2020-11-16T13:10:21.135Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:31.649Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:42.35Z", 3.696
"PM2.5", "ug/m3", "2020-11-16T13:10:52.953Z", 3.696  
"PM10", "ug/m3", "2020-11-16T13:10:21.135Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:31.649Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:42.35Z", 7.596
"PM10", "ug/m3", "2020-11-16T13:10:52.953Z", 7.596
2
  • @Shawn yes sorry was trying to add but was getting flagged for being mostly code by stack overflow, I have added an example now Commented Nov 16, 2020 at 14:44
  • @Shawn yep you are right, here is the minimal example Commented Nov 16, 2020 at 15:10

1 Answer 1

5
$ jq -r '
  [["name","unit","timestamp","value"]], 
   [ .[] 
     | [ .name, .unit ] + 
       (.values[] | [.timestamp,.value]) ] 
  | .[] 
  | @csv' input.json 
"name","unit","timestamp","value"
"PM2.5","ug/m3","2020-11-16T13:10:21.135Z",3.696
"PM2.5","ug/m3","2020-11-16T13:10:31.649Z",3.696
"PM2.5","ug/m3","2020-11-16T13:10:42.35Z",3.696
"PM2.5","ug/m3","2020-11-16T13:10:52.953Z",3.696
"PM10","ug/m3","2020-11-16T13:10:21.135Z",7.596
"PM10","ug/m3","2020-11-16T13:10:31.649Z",7.596
"PM10","ug/m3","2020-11-16T13:10:42.35Z",7.596
"PM10","ug/m3","2020-11-16T13:10:52.953Z",7.596

Builds an array of arrays, one per row including a header line, and then formats them all as CSV records.

Sign up to request clarification or add additional context in comments.

2 Comments

Thanks not quite the output format I was after, I have improved the question
This can be simplified (and made more efficient) by removing the need for the final .[], as follows: '["name", "unit", "timestamp", "value"] , (.[] | [.name, .unit] + (.values[] | [.timestamp, .value])) | @csv'

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.