1

This is my first time accessing an API / working with json data so if anyone can point me towards a good resource for understanding how to work with it I'd really appreciate it.

Specifically though, I have json data in this form:

{"result": { "code": "OK", "msg": "" },"report_name":"DAILY","columns":["ad","ad_impressions","cpm_cost_per_ad","cost"],"data":[{"row":["CP_CARS10_LR_774470","966","6.002019","5.797950"]}],"total":["966","6.002019","5.797950"],"row_count":1}

I understand this structure but I don't know how to get it into a DataFrame properly.

2
  • Could you post the whole JSON please. Commented Jul 24, 2015 at 17:53
  • 1
    You might be better off parsing out the column names and data and creating a dataframe yourself, rather than use the pd.read_json class. So parse out json['columns'] and json['data'][0]['row'] for columns and rows respectively, then construct your data frame as you would normally. Commented Jul 24, 2015 at 18:29

2 Answers 2

1

Looking at the structure of your json, presumably you will have several rows for your data and in my opinion it will make more sense to build the dataframe yourself.

This code uses columns and data to build a dataframe:

In [12]:

import json
import pandas as pd
​
with open('... path to your json file ...') as fp:
    for line in fp:
        obj = json.loads(line)
        columns = obj['columns']
        data = obj['data']
        l = []
        for d in data:
            l += [d['row']]
        df = pd.DataFrame(l, index=None, columns=columns)

df
Out[12]:
ad  ad_impressions  cpm_cost_per_ad cost
0   CP_CARS10_LR_774470 966 6.002019    5.797950

As for the rest of the data, in your json, I guess you could e.g. use the totals for checking your dataframe,

In [14]:

sums = df.sum(axis=0)
obj['total']
for i in range(0,3):
    if (obj['total'][i] != sums[i+1]):
        print "error in total"

In [15]:

if obj['row_count'] != len(df.index):
    print "error in row count"

As for the rest of the data in the json, it is difficult for me to know if anything else should be done.

Hope it helps.

Sign up to request clarification or add additional context in comments.

Comments

0

Check pandas documentation. Specifically,

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_json.html

Pandas supports reading json to dataframe

1 Comment

I get an error with pd.read_json(response) stating "arrays must all be the same length". The problem is that there is a final row, "Total" which has excludes ad names, I don't see any documentation on how to handle this from the above link

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.