From the course: Data Analysis with Python and Pandas
Categorical series aggregation
- [Instructor] All right, so let's take a look at a few more types of aggregations we can perform on series. The methods here tend to work best on text fields or categorical fields that have values repeated throughout a series of data, but we can call them on numeric series as well. Our first method is unique, which returns an array of the unique items in a series. nunique returns the number of unique items in a series, and value_counts returns a series of the unique items and their frequency in our data. So here, we have our items series. Just note that coffee appears twice; all other values occur once. So, when we call value_counts on our items series, we can see coffee gets a count of two, tea gets a count of one, and so on. If we specify normalize equals True, we get back the proportion of the time each value occurs in our data, which is often more useful for analysis than a raw count. It depends on what we're trying to do, but usually, we'll want to say 40% of our sales…
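The three methods described in the transcript can be tried out with a short sketch like the one below. The series contents are an assumption made for illustration: only coffee (appearing twice) and tea (appearing once) come from the narration, while the remaining items and the variable names are hypothetical.

```python
import pandas as pd

# Hypothetical items series: "coffee" appears twice, every other value once.
# Only "coffee" and "tea" are mentioned in the narration; the rest is made up.
items = pd.Series(["coffee", "tea", "coffee", "banana", "sugar"], name="item")

# Array of the distinct values, in order of first appearance.
unique_items = items.unique()

# Number of distinct values (missing values are excluded by default).
n_unique = items.nunique()

# Frequency of each distinct value, most frequent first.
counts = items.value_counts()

# Same breakdown as proportions that sum to 1 instead of raw counts.
shares = items.value_counts(normalize=True)

print(unique_items)
print(n_unique)
print(counts)
print(shares)
```

With this made-up data, coffee accounts for two of five rows, so its normalized share is 0.4. Note that normalize=True returns fractions between 0 and 1, so multiply by 100 if a percentage is needed for display.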