I have a dataframe like this:

Company_id  year  dummy_1 dummy_2 dummy_3 dummy_4 dummy_5
1           1990   1       0        1        1      1
1           1991   0       0        1        1      0
1           1992   0       0        1        1      0
1           1993   1       0        1        1      0
1           1994   0       1        1        1      0
1           1995   0       0        1        1      0
1           1996   0       0        1        1      1

I need the last 5 columns of each row as a vector, appended to the original dataframe as a new column. I know I can slice the columns and create a matrix, such as:

df.iloc[:, -5:].values

Here is the output that I want:

 Company_id  year  dummy_1 dummy_2 dummy_3 dummy_4 dummy_5   vector
    1           1990   1       0        1        1      1       [1, 0, 1, 1, 1]
    1           1991   0       0        1        1      0       [0, 0, 1, 1, 0]
    1           1992   0       0        1        1      0       [0, 0, 1, 1, 0]
    1           1993   1       0        1        1      0       [1, 0, 1, 1, 0]
    1           1994   0       1        1        1      0       [0, 1, 1, 1, 0]
    1           1995   0       0        1        1      0       [0, 0, 1, 1, 0]
    1           1996   0       0        1        1      1       [0, 0, 1, 1, 1]

But how can I then add it as a column of arrays to the original dataframe?

  • Sure, adding now Commented Aug 30, 2018 at 13:41

1 Answer

I believe you need to select the last 5 columns with iloc, convert them to a NumPy array and then to nested lists, and assign the result as a new column:

# Select the last 5 columns, turn each row into a list, and store it in a new column
df = df.assign(new=df.iloc[:, -5:].values.tolist())
print(df)
   Company_id  year  dummy_1  dummy_2  dummy_3  dummy_4  dummy_5  \
0           1  1990        1        0        1        1        1   
1           1  1991        0        0        1        1        0   
2           1  1992        0        0        1        1        0   
3           1  1993        1        0        1        1        0   
4           1  1994        0        1        1        1        0   
5           1  1995        0        0        1        1        0   
6           1  1996        0        0        1        1        1   

               new  
0  [1, 0, 1, 1, 1]  
1  [0, 0, 1, 1, 0]  
2  [0, 0, 1, 1, 0]  
3  [1, 0, 1, 1, 0]  
4  [0, 1, 1, 1, 0]  
5  [0, 0, 1, 1, 0]  
6  [0, 0, 1, 1, 1]  
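
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It rebuilds the sample data from the question and selects the dummy columns by name prefix rather than by position, which is an assumption on my part (it only helps if the columns really are all named `dummy_*`). The column name `vector` follows the desired output in the question.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Company_id": [1] * 7,
    "year": list(range(1990, 1997)),
    "dummy_1": [1, 0, 0, 1, 0, 0, 0],
    "dummy_2": [0, 0, 0, 0, 1, 0, 0],
    "dummy_3": [1] * 7,
    "dummy_4": [1] * 7,
    "dummy_5": [1, 0, 0, 0, 0, 0, 1],
})

# Assumption: the columns of interest all share the "dummy_" prefix
dummy_cols = [c for c in df.columns if c.startswith("dummy_")]

# Convert the selected block to a 2D NumPy array, then to one list per row
df = df.assign(vector=df[dummy_cols].to_numpy().tolist())

print(df[["year", "vector"]])
```

Selecting by name is a little more robust than `iloc[:, -5:]` if columns are later added or reordered, but the positional version works just as well for the data shown.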

2 Comments

Sorry, should I have asked for something like toarray()? I will use the vectors for comparison later on.
@DogukanYılmaz - Then do you need df['new'] = df['new'].apply(np.array)?
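
A rough sketch of what that conversion and a later comparison might look like. The small three-column frame, the `reference` vector, and the `matches` column are made up for illustration and are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame standing in for the dummy columns
df = pd.DataFrame({"a": [1, 0, 1], "b": [0, 0, 0], "c": [1, 1, 1]})

# Store each row as a list, then convert every list to a NumPy array
df["new"] = df[["a", "b", "c"]].to_numpy().tolist()
df["new"] = df["new"].apply(np.array)

# Hypothetical comparison: flag rows whose vector equals a reference vector
reference = np.array([1, 0, 1])
df["matches"] = df["new"].apply(lambda v: np.array_equal(v, reference))

print(df)
```

Note that a column holding arrays has `object` dtype, so comparisons go through `apply` row by row. If the comparison is all you need, working on the 2D array from `to_numpy()` directly is likely simpler and faster.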
