From the course: Python for Data Science and Machine Learning Essential Training Part 1

Cleaning and treating categorical variables

- [Instructor] Let's take another look at categorical variables: why we might need to treat them, and the options we have for doing so. As you recall, a categorical variable is a variable that can take on only a limited, fixed number of possible values. For example, fruit type is a categorical variable, because there are only a limited number of types of fruit: say, apples, oranges, and lemons. There's not an infinite number of fruit types, so it's categorical. In machine learning, it's common to come across categorical variables when addressing data science challenges. Typically, machine learning algorithms are not equipped to process categorical data directly. Therefore, we have to transform this type of data into numerical formats that are compatible with those algorithms. This transformation can be done through various methods, including label encoding and one-hot encoding, among others. The conversion of…
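The two encodings named in the transcript can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the course's own code: the `fruits` list and the helper names `label_encode` and `one_hot_encode` are hypothetical, and categories are assigned codes in sorted order purely for reproducibility.

```python
# Hypothetical sample data: the fruit types mentioned in the transcript.
fruits = ["apples", "oranges", "lemons", "apples"]


def label_encode(values):
    """Map each distinct category to an integer code (assigned in sorted order)."""
    categories = sorted(set(values))
    mapping = {category: code for code, category in enumerate(categories)}
    return [mapping[value] for value in values], mapping


def one_hot_encode(values):
    """Return one row of 0/1 indicators per value, with one column per category."""
    categories = sorted(set(values))
    rows = [
        [1 if value == category else 0 for category in categories]
        for value in values
    ]
    return rows, categories


labels, mapping = label_encode(fruits)
one_hot, columns = one_hot_encode(fruits)

print(labels)   # integer code for each fruit
print(columns)  # column order of the one-hot matrix
print(one_hot)  # one indicator row per fruit
```

Label encoding yields a single numeric column, but the integer codes suggest an ordering that fruit types do not actually have. One-hot encoding avoids that implied order at the cost of one column per category. In practice, libraries such as pandas and scikit-learn provide equivalents of both.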
