I am new to Python. I have a DataFrame that looks like this:

    A   B   C   D   E
0   1   0   1   0   1
1   0   1   0   0   1
2   0   1   1   1   0
3   1   0   0   1   0
4   1   0   0   1   1

How can I write a for loop that gathers, for each row, the names of the columns containing a 1? I expect the result to look like this:

    A   B   C   D   E   Result
0   1   0   1   0   1   ACE
1   0   1   0   0   1   BE
2   0   1   1   1   0   BCD
3   1   0   0   1   0   AD
4   1   0   0   1   1   ADE

Can anyone help me with that? Thank you!

1 Answer

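The dot method is made for this: you want the matrix product of your DataFrame and the vector of column names. Each cell multiplies its column's name (1 keeps the name, 0 gives an empty string), and the products are concatenated along the row: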

df.dot(df.columns)
Out[5]: 
0    ACE
1     BE
2    BCD
3     AD
4    ADE
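
To see why this works, here is a minimal plain-Python sketch of what the dot product computes for each row. The helper name row_dot_labels is made up for illustration; it is not part of pandas, and the real computation happens inside NumPy.

```python
# Column labels and rows copied from the question, as plain Python lists.
columns = ["A", "B", "C", "D", "E"]
rows = [
    [1, 0, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 1, 1, 1, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 1],
]


def row_dot_labels(row, labels):
    """Hypothetical helper: concatenate value * label across one row."""
    # int * str repeats the string: 1 * "A" == "A" and 0 * "A" == "".
    return "".join(value * label for value, label in zip(row, labels))


result = [row_dot_labels(row, columns) for row in rows]
print(result)  # ['ACE', 'BE', 'BCD', 'AD', 'ADE']

# An integer greater than 1 repeats the label instead of selecting it once.
print(row_dot_labels([2, 0, 1, 0, 0], columns))  # AAC
```

A float in place of the integer (for example 1.0 * "A") raises a TypeError, because Python only allows a sequence to be repeated by an integer.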

If your DataFrame holds floats, or integers greater than 1, first convert it to a boolean matrix by comparing it against 0:

(df!=0).dot(df.columns)
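
As a self-contained sketch of that case, the snippet below rebuilds the question's frame with float values. That is an assumption about what the asker's real data looks like, based on the TypeError reported in the comments. The variable name mask is only for illustration.

```python
import pandas as pd

# Same 0/1 pattern as the question, but stored as floats (assumed, not
# confirmed by the asker).
df = pd.DataFrame(
    {
        "A": [1.0, 0.0, 0.0, 1.0, 1.0],
        "B": [0.0, 1.0, 1.0, 0.0, 0.0],
        "C": [1.0, 0.0, 1.0, 0.0, 0.0],
        "D": [0.0, 0.0, 1.0, 1.0, 1.0],
        "E": [1.0, 1.0, 0.0, 0.0, 1.0],
    }
)

# Comparing against 0 turns every cell into True/False, so each label is
# kept at most once regardless of the original numeric dtype.
mask = df != 0

# Dot product of the boolean matrix with the column labels.
result = mask.dot(df.columns)
print(result.tolist())  # ['ACE', 'BE', 'BCD', 'AD', 'ADE']
```

Here the comparison is done once and reused; writing (df != 0).dot(df.columns) inline is equivalent.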

PS: To get the new column, just assign the result to it:

df['Result'] = df.dot(df.columns)

df
Out[7]: 
   A  B  C  D  E Result
0  1  0  1  0  1    ACE
1  0  1  0  0  1     BE
2  0  1  1  1  0    BCD
3  1  0  0  1  0     AD
4  1  0  0  1  1    ADE

4 Comments

@zihan What is going on behind the scenes is effectively Boolean selection. This ONLY works because your matrix values are 0s and 1s.
Thank you for helping with my question, but I get an error: TypeError: can't multiply sequence by non-int of type 'float'
@Merlin Sure: with an integer n > 1 you get the value repeated n times. NumPy follows Python semantics under the hood, so you are computing n * 'column title'. That fits the OP's matrix, though.
@Merlin I added a line explaining how to quickly turn a float matrix into a boolean one.
