1

so I have this CSV and I would like to do the following:

Original data:

enter image description here

Parsed Data:

enter image description here

So, to put in words, if a column has commas then I want to create a new column with only one value and delete the column which has multiple values.

For example: N2 has I1, I3 and I4. Hence the new data gets 3 columns, containing one value only.

I want to make it dynamic in such a way that all the permutations are reflected. Like in the case of N3 that has 2 places and 2 items.

I am trying to use python's pandas to do this. Some help would be appreciated.

1
  • for the first row the comma after I1 is not an error? all unique values in item finish by a comma? Commented Feb 24, 2019 at 17:34

4 Answers 4

1

Here is another option:

df['Place'] = df['Place'].str.split(',')
df['Item'] = df['Item'].str.split(',')

exploded = pd.DataFrame([
    a + [p, t] for *a, P, T in df.values
    for p in P for t in T
], columns=df.columns)

And the output:

  Name Place Item
0   N1    P1   I1
1   N2    P2   I1
2   N2    P2   I3
3   N2    P2   I4
4   N3    P2   I2
5   N3    P2   I5
6   N3    P3   I2
7   N3    P3   I5
Sign up to request clarification or add additional context in comments.

Comments

0

You are effectively attempting to take the Cartesian product of each row, then binding the result back into a DataFrame. As such, you could use itertools and do something like

from itertools import chain, product
df_lists = df.applymap(lambda s: s.split(','))
pd.DataFrame(chain.from_iterable(df_lists.apply(lambda row: product(*row), axis=1)), columns=df.columns)

With your example input:

In [334]: df
Out[334]:
  Name  Place      Item
0   N1     P1        I1
1   N2     P2  I1,I3,I4
2   N3  P2,P3     I2,I5

In [336]: df_lists = df.applymap(lambda s: s.split(','))

In [337]: pd.DataFrame(chain.from_iterable(df_lists.apply(lambda row: product(*row), axis=1)), columns=df.columns)
Out[337]:
  Name Place Item
0   N1    P1   I1
1   N2    P2   I1
2   N2    P2   I3
3   N2    P2   I4
4   N3    P2   I2
5   N3    P2   I5
6   N3    P3   I2
7   N3    P3   I5

Comments

0

You can use iterrows() :

df = pd.DataFrame({'Name': ['N1', 'N2', 'N3'], 'Place':['P1', 'P2','P2,P3'], 'Item':['I1,', 'I1,I3,I4', 'I2,I5']})

result = pd.DataFrame()
new_result = pd.DataFrame()

df['Place'] = df['Place'].apply(lambda x: x.strip(','))
df['Item'] = df['Item'].apply(lambda x: x.strip(','))

for a,b  in df.iterrows():
    curr_row = df.iloc[a]
    temp  = ((curr_row['Place'].split(',')))
    for x in temp:
        curr_row['Place'] = x
        result = result.append(curr_row, ignore_index=True)

for a,b  in result.iterrows():
    curr_row = result.iloc[a]
    temp  = ((curr_row['Item'].split(',')))
    for x in temp:
        curr_row['Item'] = x
        new_result = new_result.append(curr_row, ignore_index=True)

Output:

  Name Place Item
0   N1    P1   I1
1   N2    P2   I1
2   N2    P2   I3
3   N2    P2   I4
4   N3    P2   I2
5   N3    P2   I5
6   N3    P3   I2
7   N3    P3   I5

This is the simplest way you can achieve your desired output.

Comments

0

You can avoid the use of pandas. If you want to stick with the standard csv module, you simply have to split each field on comma (',') and then iterate on the splitted elements.

Code could be assuming the input delimiter is a semicolon (;) (I cannot know what it is except it cannot be a comma):

with open('input.csv', newline='') as fd, open('output.csv', 'w', newline='') as fdout:
    rd = csv.DictReader(fd, delimiter=';')
    wr = csv.writer(fdout)
    _ = wr.writerow(rd.fieldnames)
    for row in rd:
       for i in row['Item'].split(','):
           i = i.strip()
           if len(i) != 0:
               for p in row['Place'].split(','):
                   p = p.strip()
                   if len(p) != 0:
                       for n in row['Name'].split(','):
                           n = n.strip()
                           if len(n) != 0:
                               wr.writerow((n,p,i))

Output is:

Name,Place,Item
N1,P1,I1
N2,P2,I1
N2,P2,I3
N2,P2,I4
N3,P2,I2
N3,P3,I2
N3,P2,I5
N3,P3,I5

Comments

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.