How to remove multiple columns in an R data frame?

Question

I am trying to remove some columns from a data frame. I want to know why this works for a single column but not for multiple columns. For example, this works:

album2[,5]<- NULL

this doesn't work:

album2[,c(5:7)]<- NULL
Error in `[<-.data.frame`(`*tmp*`, , 5:7, value = NULL) : 
replacement has 0 items, need 600

This also doesn't work:

for (i in 5: (length(album2)-1)){
 album2[,i]<- NULL
}
Error in `[<-.data.frame`(`*tmp*`, , i, value = NULL) : 
new columns would leave holes after existing columns

It would be great if you could supply a minimal reproducible example to go along with your question: something we can work from and use to show you how it might be possible to answer it. That way others can also benefit from your question, and the accompanying answer, in the future. You can have a look at this SO post on how to make a great reproducible example in R.
– Eric Fail, Commented Jan 5, 2016 at 17:42
@EricFail Especially as, as far as I can tell, the first example ("e.g. this works") doesn't actually work.
– doctorG, Commented Jan 5, 2016 at 17:47
@doctorG Using list(NULL) made it work with multiple columns; using NULL with a single column worked. I will take care of reproducibility in the future.
– Ahmed Elmahy, Commented Jan 5, 2016 at 17:57

doctorG · Accepted Answer · 2016-01-05 17:43:35Z

64

Basic subsetting:

album2 <- album2[, -5] #delete column 5
album2 <- album2[, -c(5:7)] # delete columns 5 through 7
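
As a minimal sketch of the above on a toy data frame (the contents of album2 here are invented, since the question does not show the real data): negative indices drop all listed columns in one step, so column positions never shift partway through, which is likely what trips up the loop in the question. One caveat worth knowing is that an empty negative index selects nothing rather than everything.

```r
# Hypothetical stand-in for the asker's data: 3 rows, 8 columns named V1..V8
album2 <- as.data.frame(matrix(1:24, nrow = 3))

# Drop columns 5 through 7 in a single subsetting call
trimmed <- album2[, -(5:7)]
names(trimmed)  # "V1" "V2" "V3" "V4" "V8"

# Caveat: -integer(0) is the same as integer(0), so this keeps NO columns
none_left <- album2[, -integer(0)]
ncol(none_left)  # 0
```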

answered Jan 5, 2016 at 17:43

doctorG

4 Comments

Jia Gao Over a year ago

Dropping columns by their position is not recommended, at least in my view.

doctorG Over a year ago

Yes, and no. The OP was posed in context of specifying column positions. If you know the desired positions, then this is fine. For others to know if your comment is useful/relevant to them, could you add why you would not recommend it?

Jia Gao Over a year ago

Well, what if someone adds a new column to the data and the column positions change? I agree that your answer is correct, but it's neither safe nor efficient.

doctorG Over a year ago

It's implicit you're at the point of knowing what column numbers you want. Getting to that point is up to you. Considering whether you're doing it interactively or programmatically (and thus what conditions you need to cope with) is also up to you.

Andrew Haynes · Accepted Answer · 2018-07-19 12:42:26Z

42

Adding an answer, as this was the top hit when searching for "drop multiple columns in r":

The general version of single-column removal (e.g. df$column1 <- NULL) is to assign list(NULL):

df[ ,c('column1', 'column2')] <- list(NULL)

This also works with positional indices:

df[ ,c(1,2)] <- list(NULL)

This is a more general way to drop columns and, as some comments have mentioned, removing by index isn't recommended. In addition, the familiar negative subset (used in other answers) doesn't work for columns given as strings:

> iris[ ,-c("Species")]
Error in -"Species" : invalid argument to unary operator
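
A small runnable sketch of both points, using a made-up three-column data frame. As far as I understand it, a one-element list on the right-hand side is recycled across the selected columns and a NULL element deletes its column, whereas a bare NULL is treated as a zero-length replacement (hence the "replacement has 0 items" error in the question). The setdiff() line is one possible name-based substitute for the negative subset, not something the answer itself proposes.

```r
# Made-up example data
df <- data.frame(column1 = 1:3, column2 = 4:6, column3 = 7:9)

# list(NULL) is recycled over both selected columns, removing each of them
df[, c("column1", "column2")] <- list(NULL)
names(df)  # "column3"

# Name-based alternative to negative indexing: keep everything except `unwanted`
df2 <- data.frame(column1 = 1:3, column2 = 4:6, column3 = 7:9)
unwanted <- c("column1", "column2")
kept <- df2[, setdiff(names(df2), unwanted), drop = FALSE]
names(kept)  # "column3"
```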

answered Jul 19, 2018 at 12:42

Andrew Haynes

1 Comment

vasili111 Over a year ago

Can you please explain why list(NULL) and not just NULL?

marc_s · Accepted Answer · 2021-06-17 14:51:57Z

12

This works for me.

x <- dplyr::select(dataset_df, -c('column1', 'column2'))
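
If the names to drop are held in a character vector, a hedged variant is to wrap them in the tidyselect helpers all_of() or any_of(), which dplyr re-exports. The data frame and the to_drop vector below are invented for illustration, and the sketch assumes dplyr 1.0 or later is installed.

```r
library(dplyr)

# Made-up example data and a hypothetical vector of names to drop
dataset_df <- data.frame(column1 = 1:3, column2 = 4:6, column3 = 7:9)
to_drop <- c("column1", "column2")

# all_of() raises an error if any listed column is missing
x <- select(dataset_df, -all_of(to_drop))
names(x)  # "column3"

# any_of() silently ignores names that are not present
y <- select(dataset_df, -any_of(c("column1", "not_a_column")))
names(y)  # "column2" "column3"
```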

edited Jun 17, 2021 at 14:51

marc_s

answered May 14, 2020 at 11:29

Dulakshi Soysa

Yoh Deadfall · Accepted Answer · 2018-03-26 14:43:27Z

10

If you only want to remove columns 5 and 7 but not 6, try:

album2 <- album2[,-c(5,7)] #deletes columns 5 and 7

edited Mar 26, 2018 at 14:43

Yoh Deadfall

answered Mar 26, 2018 at 14:25

Kara

ExploreR · Accepted Answer · 2019-01-24 13:59:24Z

7

@Ahmed Elmahy The following approach should help when you have a vector of column names you want to remove from your data frame:

test_df <- data.frame(col1 = c("a", "b", "c", "d", "e"), col2 = seq(1, 5), col3 = rep(3, 5))
rm_col <- c("col2")
test_df[, !(colnames(test_df) %in% rm_col), drop = FALSE]

All the best, ExploreR
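
On the role of drop = FALSE in the snippet above, here is a short sketch with a cut-down, two-column version of test_df (so that removing col2 leaves exactly one column). By default, subsetting with [ , j] simplifies a single remaining column to a plain vector; drop = FALSE keeps the result a data frame.

```r
# Cut-down version of the example data, for illustration only
test_df <- data.frame(col1 = c("a", "b", "c"), col2 = 1:3)
rm_col <- c("col2")

# Default behaviour: the single remaining column collapses to a vector
as_vector <- test_df[, !(colnames(test_df) %in% rm_col)]
is.data.frame(as_vector)  # FALSE

# With drop = FALSE the result stays a one-column data frame
as_frame <- test_df[, !(colnames(test_df) %in% rm_col), drop = FALSE]
is.data.frame(as_frame)   # TRUE
```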

answered Jan 24, 2019 at 13:59

ExploreR

1 Comment

Kyouma Over a year ago

what is drop doing in this context?

Sandy · Accepted Answer · 2021-11-24 10:09:01Z

Another solution, similar to @Dulakshi Soysa's, is to use column names and specify a range.

For example, suppose our data frame df has columns named column_1, column_2, column_3, and so on up to column_15, and we are interested in deleting the 5th through 10th columns.

We can specify a range using column names, e.g.:

library(dplyr)
x = select(df, -c('column_5':'column_10'))

Specifying the range can save some time when you are deleting multiple adjacent columns. It can also be used if you want to remove some adjacent and some non-adjacent columns. For example, if you want to remove the 1st column in addition to the previously specified columns, you would update the code as below:

library(dplyr)
x = select(df, -c('column_1', 'column_5':'column_10'))
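
For comparison, a rough base R sketch of the same name-range idea without dplyr. This is not part of the answer above: it looks up the positions of the two boundary names with match() and drops everything in between. The data frame is built here purely for illustration.

```r
# Made-up data frame with columns column_1 ... column_15
df <- as.data.frame(matrix(0, nrow = 2, ncol = 15))
names(df) <- paste0("column_", 1:15)

# Translate the boundary names into positions; stop early if either is missing
first <- match("column_5", names(df))
last <- match("column_10", names(df))
stopifnot(!is.na(first), !is.na(last))

# Drop the whole range in one step
x <- df[, -(first:last), drop = FALSE]
names(x)  # column_1..column_4 and column_11..column_15
```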

Anoushiravan R · Accepted Answer · 2021-04-22 00:28:07Z

1

The following line returns a copy of the data frame 'data' without col_1 and col_2:

data[!(colnames(data) %in% c('col_1','col_2'))]

edited Apr 22, 2021 at 0:28

Anoushiravan R

answered Apr 22, 2021 at 0:24

Jacob

Anoushiravan R · Accepted Answer · 2021-04-22 00:37:31Z

1

Here is an interesting solution I read the other day in @JoachimSchork's blog, Statistics Globe. You can remove columns by column name. You can find out more here.

library(data.table)

mtcars2 <- mtcars

setDT(mtcars2)[, c("mpg", "cyl", "disp", "hp") := NULL]

> head(mtcars2)
   drat    wt  qsec vs am gear carb
1: 3.90 2.620 16.46  0  1    4    4
2: 3.90 2.875 17.02  0  1    4    4
3: 3.85 2.320 18.61  1  1    4    1
4: 3.08 3.215 19.44  1  0    3    1
5: 3.15 3.440 17.02  0  0    3    2
6: 2.76 3.460 20.22  1  0    3    1

answered Apr 22, 2021 at 0:37

Anoushiravan R
