I have a pandas DataFrame with 10 rows and 5 columns, and a NumPy array of zeros, np.zeros((10, 3)).

I want to concatenate the array to the DataFrame, but I want to drop the last column of the DataFrame before concatenating.

So I will end up with 10 rows and 5 - 1 + 3 = 7 columns.

I guess I could use

new_dataframe = pd.concat([
    original_dataframe,
    pd.DataFrame(np.zeros((10, 3)), dtype=int)
], axis=1, ignore_index=True)

where original_dataframe has 10 rows and 5 columns.

How do I delete the last column from original_dataframe before concatenating the numpy array? And how do I make sure I preserve all the data types?

  • You can slice the original df: new_dataframe = pd.concat([original_dataframe.ix[:, :-1], pd.DataFrame(np.zeros((10, 3)), dtype=int)], axis=1, ignore_index=True). With regard to your last question, aren't the data types preserved anyway? Commented Sep 26, 2016 at 8:56
  • ix is deprecated now, so consider using iloc or loc. See my answer below. Commented Dec 18, 2018 at 4:34

1 Answer

Setup

import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 3)), columns=list('ABC'))
df

   A  B  C
0  5  0  3
1  3  7  9
2  3  5  2

np.column_stack / stack(axis=1) / hstack

pd.DataFrame(np.column_stack([df, np.zeros((df.shape[0], 3), dtype=int)]))

   0  1  2  3  4  5
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0

This is useful (and performant), but it does not retain the column names from df, because the result is rebuilt from a plain NumPy array. If you want to drop the last column first, slice it off with iloc:

pd.DataFrame(np.column_stack([
    df.iloc[:, :-1], np.zeros((df.shape[0], 3), dtype=int)]))

   0  1  2  3  4
0  5  0  0  0  0
1  3  7  0  0  0
2  3  5  0  0  0
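
If the column names matter, one option is to pass them back in explicitly. The sketch below is only an illustration: the frame `mixed` and the new column names D, E and F are hypothetical, not from the question. It also shows a caveat for the dtype concern in the question: stacking goes through a single NumPy array, so every column ends up with one common dtype.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with mixed dtypes (int, float, int), for illustration only.
mixed = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5], "C": [7, 8, 9]})
zeros = np.zeros((mixed.shape[0], 3), dtype=int)

kept = mixed.iloc[:, :-1]                     # drop the last column
new_names = list(kept.columns) + list("DEF")  # hypothetical names for the added columns

stacked = pd.DataFrame(
    np.column_stack([kept.to_numpy(), zeros]),
    columns=new_names,
    index=mixed.index,
)

# Column A was an integer column, but the stacked array has a single dtype.
print(stacked.dtypes)
```

Here the integer column A comes back as float, so this route is best suited to frames whose columns already share a dtype.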

pd.concat

pd.concat only accepts pandas objects, so you will need to convert the array to a DataFrame first.

df2 = pd.DataFrame(np.zeros((df.shape[0], 3), dtype=int), columns=list('DEF'))
pd.concat([df, df2], axis=1)
 
   A  B  C  D  E  F
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0
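
To also drop the last column and check that dtypes survive, the same idea can be sketched as follows. The frame `mixed` and the names D, E and F are hypothetical. Passing `index=mixed.index` is an assumption about what is wanted here: `pd.concat` with `axis=1` aligns on the index, so the two frames need matching row labels.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with int, float and string columns, for illustration only.
mixed = pd.DataFrame({"A": [1, 2, 3], "B": [0.5, 1.5, 2.5], "C": list("xyz")})

# Build the zeros as a DataFrame that shares the original index.
zeros = pd.DataFrame(
    np.zeros((mixed.shape[0], 3), dtype=int),
    columns=list("DEF"),
    index=mixed.index,
)

# Slice off the last column, then concatenate column-wise.
result = pd.concat([mixed.iloc[:, :-1], zeros], axis=1)

print(result.dtypes)  # each column keeps the dtype it had before concatenation
```

Because each column is carried over as-is, this variant keeps per-column dtypes and column names, at the cost of building an intermediate DataFrame.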

DataFrame.assign

If you are only adding constant-valued columns, you can use assign:

df.assign(**dict.fromkeys(list('DEF'), 0))

   A  B  C  D  E  F
0  5  0  3  0  0  0
1  3  7  9  0  0  0
2  3  5  2  0  0  0
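
Combined with the iloc slice, a minimal sketch of the full task (drop the last column, then add three zero columns) might look like this. It reuses the setup above; the names D, E and F are the answer's own placeholders.

```python
import numpy as np
import pandas as pd

np.random.seed(0)
df = pd.DataFrame(np.random.choice(10, (3, 3)), columns=list("ABC"))

# iloc returns a new object and assign returns a copy, so df itself is unchanged.
result = df.iloc[:, :-1].assign(**dict.fromkeys(list("DEF"), 0))

print(result.columns.tolist())
```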