From the course: R for Data Science: Lunch Break Lessons

Use droplevels() to simplify factors

- [Instructor] droplevels() allows you to simplify factors, and if you're unclear on what a factor is, I'll refer you back to one of the earliest sessions in the R Language Weekly series. So to demonstrate droplevels(), I need a factor. Let's create one. I'm going to create one here in the first line called some factors, and I'll hit return to run that. You'll notice that I'm in the console, so I don't need to hit Command + Return, and I now have a factor. We can take a look at that. I'll just type in some factors, and you can see that I have four items in it: apple, apple, banana, and cherry, and the levels in some factors are apple, banana, and cherry.

Now I can use table() to count the items in some factors. You can see I have two apples, one banana, and one cherry. I can use levels() to find out what the levels are in this factor. The levels are apple, banana, and cherry, and I could even plot that. So let's plot some factors, and you can see in the plots panel that I now have a simple plot with two apples, one banana, and one cherry.

Okay, so let's say that we are only interested in red fruits, so I'll need to get rid of that banana. To do that, I'll select the third item of some factors, which happens to be a banana, and I can prove that by typing it in and hitting return: some factors, bracket three, is a banana. Let's get rid of it by assigning NA, which means not available, to that item. Now when I print some factors, you'll see that I have an apple, an apple, NA, and a cherry.

But that level still appears: if I type in table, some factors, you'll see that I have two apples and one cherry. I have zero bananas, but banana is still a level in that factor, so how do I get rid of banana entirely? Well, this is where droplevels() comes in. I'm going to create a vector called no bananas, and into it, I'm going to place the result of droplevels() against some factors. droplevels() removes any unused levels from a factor.
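For anyone following along without the video, the steps so far might look roughly like this. This is a sketch, not the instructor's exact script: the variable name `some_factors` is an assumption, since the transcript only gives the spoken name.

```r
# Hypothetical variable name; the transcript only says "some factors".
some_factors <- factor(c("apple", "apple", "banana", "cherry"))

table(some_factors)     # counts per level: 2 apple, 1 banana, 1 cherry
levels(some_factors)    # "apple" "banana" "cherry"

some_factors[3]         # the third item is the banana
some_factors[3] <- NA   # replace it with NA (not available)

table(some_factors)     # banana is still listed, now with a count of 0
levels(some_factors)    # the level set is unchanged: banana is still there
```

Assigning NA only changes the value stored at position three; the factor's set of levels is stored separately, which is why banana keeps showing up.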
So now if I type in table, some factors, you can see I still have banana as a level, with a count of zero, but if I type in table, no bananas, the banana level is gone. It's the same with plot and levels. So droplevels() is used with factors to remove unused levels.
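The before-and-after comparison can be sketched as follows. The names `some_factors` and `no_bananas` are assumed spellings of the spoken variable names, and the factor is rebuilt here with the banana already replaced by NA so the snippet runs on its own.

```r
# Hypothetical names; the factor is recreated with the banana already set to NA.
some_factors <- factor(c("apple", "apple", NA, "cherry"),
                       levels = c("apple", "banana", "cherry"))

# droplevels() returns a copy of the factor without its unused levels.
no_bananas <- droplevels(some_factors)

table(some_factors)    # banana still appears, with a count of 0
table(no_bananas)      # only apple and cherry are counted
levels(no_bananas)     # "apple" "cherry"
```

Note that droplevels() does not modify some_factors; the simplified factor only exists in no_bananas, and the NA item is still present in both.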
