I have a data structure whose output looks like this:

['409597', ['You'], '409597', ['matter'], '409597', ['manager'], '809558', ['metro'], '809558', ['station'], '829258', ['bucket'], '829258', ['water'], '867297'..........]

I want to turn this into several lists, merged by ID number. The format I want is as follows:

['You','matter','manager'],
['metro','station'],
['bucket','water'],
...........

How can I build such a list of lists (in Python), grouping the entries of my original output by matching ID?

  • How did you end up with that weird data structure in the first place? Commented Jul 30, 2018 at 10:37
  • My original data was a query result in dict format, as follows: [{'Instant_ID': 409597}, {'Token': 'You'}] [{'Instant_ID': 409597}, {'Token': 'matter'}] [{'Instant_ID': 409597}, {'Token': 'manager'}] [{'Instant_ID': 809558}, {'Token': 'metro'}] [{'Instant_ID': 809558}, {'Token': 'station'}] [{'Instant_ID': 829258}, {'Token': 'bucket'}] [{'Instant_ID': 829258}, {'Token': 'water'}] Commented Jul 30, 2018 at 10:39

3 Answers

Use zip_longest to create key-value pairs of elements and then use itertools.groupby to group them together based on the key.

>>> from itertools import groupby, zip_longest
>>> l = ['409597', ['You'], '409597', ['matter'], '409597', ['manager'], '809558', ['metro'], '809558', ['station'], '829258', ['bucket'], '829258', ['water']]
>>> [[e[1][0] for e in list(v)] for k,v in groupby(zip_longest(*([iter(l)]*2)), lambda x: x[0])]
[['You', 'matter', 'manager'], ['metro', 'station'], ['bucket', 'water']]
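
The `zip_longest(*([iter(l)]*2))` idiom can be hard to read at first. Here is a minimal sketch of the same steps spelled out one at a time; the names `it`, `pairs`, and `grouped` are only illustrative and do not come from the answer above.

```python
from itertools import groupby, zip_longest

l = ['409597', ['You'], '409597', ['matter'], '409597', ['manager'],
     '809558', ['metro'], '809558', ['station'],
     '829258', ['bucket'], '829258', ['water']]

# Both arguments are the SAME iterator, so zip_longest pulls from it
# alternately and pairs consecutive elements as (id, [token]).
it = iter(l)
pairs = list(zip_longest(it, it))

# groupby merges runs of adjacent pairs that share the same id.
grouped = [[tokens[0] for _, tokens in run]
           for _, run in groupby(pairs, key=lambda p: p[0])]
print(grouped)
```

Note that `groupby` only merges adjacent items, so this relies on equal IDs appearing next to each other, as they do in the question's data. If the list had an odd number of elements, `zip_longest` would pad the last pair with `None` and `tokens[0]` would raise an error.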

Actually, it is slightly easier to work with your original list of dicts.

>>> ld = [[{'Instant_ID': 409597}, {'Token': 'You'}], [{'Instant_ID': 409597}, {'Token': 'matter'}], [{'Instant_ID': 409597}, {'Token': 'manager'}], [{'Instant_ID': 809558}, {'Token': 'metro'}], [{'Instant_ID': 809558}, {'Token': 'station'}], [{'Instant_ID': 829258}, {'Token': 'bucket'}], [{'Instant_ID': 829258}, {'Token': 'water'}]]
>>> [[e[1]['Token'] for e in v] for k,v in groupby(ld, lambda x: x[0]['Instant_ID'])]
[['You', 'matter', 'manager'], ['metro', 'station'], ['bucket', 'water']]

Working directly from your original source data:

from itertools import groupby

l = [[{'Instant_ID': 409597}, {'Token': 'You'}], [{'Instant_ID': 409597}, {'Token': 'matter'}], [{'Instant_ID': 409597}, {'Token': 'manager'}], [{'Instant_ID': 809558}, {'Token': 'metro'}], [{'Instant_ID': 809558}, {'Token': 'station'}], [{'Instant_ID': 829258}, {'Token': 'bucket'}], [{'Instant_ID': 829258}, {'Token': 'water'}]]
print([[i[1]["Token"] for i in value] for key, value in groupby(l, lambda x: x[0]["Instant_ID"])])

Output:

[['You', 'matter', 'manager'], ['metro', 'station'], ['bucket', 'water']]

You can also do it without itertools, using only list comprehensions:

inList = ['409597', ['You'], '409597', ['matter'], '409597', ['manager'], '809558', ['metro'], '809558', ['station'], '829258', ['bucket'], '829258', ['water']]
# Collect the distinct ID strings (even positions); sorting means the groups come out in ID order
numbers = sorted(set([inList[i] for i in range(0,len(inList),2)]))
# For each ID, gather the tokens (odd positions) whose preceding element is that ID
newLists = [[inList[i][0] for i in range(1,len(inList),2) if inList[i-1]==number] for number in numbers]
print(newLists)

Output:

[['You', 'matter', 'manager'], ['metro', 'station'], ['bucket', 'water']]
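
The approach above rescans the whole list once per distinct ID. As a possible alternative, here is a sketch that groups in a single pass using a plain dict. It assumes Python 3.7 or later, where dicts preserve insertion order, so the groups come out in order of first appearance rather than sorted by ID. The name `groups` is illustrative only.

```python
inList = ['409597', ['You'], '409597', ['matter'], '409597', ['manager'],
          '809558', ['metro'], '809558', ['station'],
          '829258', ['bucket'], '829258', ['water']]

groups = {}
# Even positions hold the IDs, odd positions hold the one-element token lists.
for number, tokens in zip(inList[0::2], inList[1::2]):
    # Create the list for a new ID on first sight, then append its tokens.
    groups.setdefault(number, []).extend(tokens)

newLists = list(groups.values())
print(newLists)
```

Unlike `groupby`, this does not require equal IDs to be adjacent in the input.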
