*37*

Often you may be interested in counting the number of observations by group in a pandas DataFrame.

Fortunately this is easy to do using theÂ **groupby()** andÂ **size()** functions with the following syntax:

df.groupby('column_name').size()

This tutorial explains several examples of how to use this function in practice using the following data frame:

import numpy as np import pandas as pd #create pandas DataFrame df = pd.DataFrame({'team': ['A', 'A', 'B', 'B', 'B', 'C', 'C'], 'division':['E', 'W', 'E', 'E', 'W', 'W', 'E'], 'rebounds': [11, 8, 7, 6, 6, 5, 12]}) #display DataFrame print(df) team division rebounds 0 A E 11 1 A W 8 2 B E 7 3 B E 6 4 B W 6 5 C W 5 6 C E 12

**Example 1: Count by One Variable**

The following code shows how to count the total number of observations by team:

#count total observations by variable 'team' df.groupby('team').size() team A 2 B 3 C 2 dtype: int64

From the output we can see that:

- Team A has 2 observations
- Team B has 3 observations
- Team C has 2 observations

Note that the previous code produces a Series. In most cases we want to work with a DataFrame, so we can use theÂ **reset_index()** function to produce a DataFrame instead:

df.groupby('team').size().reset_index(name='obs') team obs 0 A 2 1 B 3 2 C 2

**Example 2: Count and Sort by One Variable**

We can also use theÂ **sort_values()** function to sort the group counts.

We can specifyÂ **ascending=False** to sort group counts from largest to smallest or **ascending=True** to sort from smallest to largest:

df.groupby('team').size().reset_index(name='obs').sort_values(['obs'], ascending=True) team obs 0 A 2 2 C 2 1 B 3

**Example 3: Count by Multiple Variables**

We can also count the number of observations grouped by multiple variables in a pandas DataFrame:

#count observations grouped by team and division df.groupby(['team', 'division']).size().reset_index(name='obs') team division obs 0 A E 1 1 A W 1 2 B E 2 3 B W 1 4 C E 1 5 C W 1

From the output we can see that:

- 1 observation belongs to Team A and division E
- 1 observation belongs to Team A and division W
- 2 observations belongs to Team B and division E
- 1 observation belongs to Team B and division W
- 1 observation belongs to Team C and division E
- 1 observation belongs to Team C and division W

**Additional Resources**

How to Calculate the Sum of Columns in Pandas

How to Calculate the Mean of Columns in Pandas

How to Find the Max Value of Columns in Pandas