3

I have an array of JSON strings and need to convert it to an array of objects (i.e., parse each JSON string into its corresponding object) without a for loop.

Source Code: (Input data)

data = ['[1,2,3]', '[4,5,6]', '[7,8,9]']

Required Output:

[[1,2,3], [4,5,6], [7,8,9]]

I am already using the following solution:

import json

data = ['[1,2,3]', '[4,5,6]', '[7,8,9]']
output = []
for item in data:
    output.append(json.loads(item))

I currently have a very large number of JSON strings (approx. 100K records), and each JSON string itself holds an array of approx. 50K items. Processing them takes more than 3 GB of RAM.

Note: the output is implicitly a 2-dimensional array [][]. The 1st dimension has approx. 100K records and the 2nd dimension has approx. 50K records, for a total of 100K * 50K items.

The conversion also takes a long time with the approach above. Kindly suggest a way to convert the JSON strings without a for loop.

10
  • Getting rid of the loop won't make a noticeable difference. Commented Jun 11, 2018 at 6:02
  • Consider multiprocessing Commented Jun 11, 2018 at 6:08
  • um... data isn't in JSON format? Commented Jun 11, 2018 at 6:14
  • @Anthony: "JSON is built on two structures: A collection of name/value pairs (...) An ordered list of values. In most languages, this is realized as an array, vector, list, or sequence." Commented Jun 11, 2018 at 11:10
  • @Anthony it's indeed considered bad practice not to wrap the content in an object (mostly because of a security issue in JavaScript that allows code injection when deserializing a JSON array), but it's nonetheless technically correct, and as a matter of fact, the json module does accept a JSON list. Commented Jun 11, 2018 at 11:12

2 Answers

2

This solution looks weird, but it works and may help with optimization. Convert the complete list to a str, remove all the single quotes (') with str.replace, and then apply json.loads. This worked for me. Note that it relies on the JSON strings containing no single quotes or backslashes, as with the numeric arrays here.

import json

data = ['[1,2,3]', '[4,5,6]', '[7,8,9]']
# str(data) is "['[1,2,3]', '[4,5,6]', '[7,8,9]']"; dropping the quotes leaves one JSON array
r = str(data).replace("'", '')
data = json.loads(r)

Now data is a list of lists, produced without an explicit loop:

[[1, 2, 3], [4, 5, 6], [7, 8, 9]]
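
The same single-call idea can be sketched without going through str() and replace(): join the JSON strings with commas and wrap them in brackets, so that json.loads sees one big JSON array. The function name parse_all_joined is hypothetical, and this is only a sketch; it assumes every element is a complete, valid JSON document, and the joined string temporarily holds a second copy of the input in memory.

```python
import json


def parse_all_joined(json_strings):
    """Parse a list of JSON documents with a single json.loads call."""
    # Comma-join the documents and wrap them in brackets so the result
    # is itself one JSON array whose elements are the original documents.
    return json.loads("[" + ",".join(json_strings) + "]")


data = ['[1,2,3]', '[4,5,6]', '[7,8,9]']
print(parse_all_joined(data))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Unlike the quote-stripping trick, quotes inside the documents survive.
tricky = ['["it\'s"]', '{"a": 1}']
print(parse_all_joined(tricky))
```

Because nothing is stripped from the strings, this variant should also cope with JSON strings that contain quotes, which the replace-based version would corrupt.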

0

ujson (a third-party JSON library) could make your code faster:

import time
import json
import ujson

a_list = list(range(5000))
data = [str(a_list)] * 10000

s = time.time()

output = []
for item in data:
    output.append(json.loads(item))

print("json : %s" % (time.time()-s))

s = time.time()

output = []
for item in data:
    output.append(ujson.loads(item))

print("ujson : %s" % (time.time()-s))

On my PC...

json : 10.048374891281128
ujson : 6.533677577972412
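
If ujson may not be installed everywhere the code runs, one option is to fall back to the standard json module. The following is a minimal sketch under that assumption; the alias json_impl and the function parse_all are hypothetical names chosen for illustration.

```python
try:
    # ujson is a third-party package (assumed to be installed separately).
    import ujson as json_impl
except ImportError:
    # Fall back to the standard library when ujson is unavailable.
    import json as json_impl


def parse_all(json_strings):
    """Parse each JSON string with whichever parser was imported."""
    return [json_impl.loads(item) for item in json_strings]


data = ['[1,2,3]', '[4,5,6]', '[7,8,9]']
print(parse_all(data))  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

The list comprehension is equivalent to the append loop above; any speed-up comes from the parser, not from removing the loop.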
