
You can use the **describe()** function to generate descriptive statistics for variables in a pandas DataFrame.

You can use the following basic syntax to combine the **describe()** function with the **groupby()** function in pandas:

**df.groupby('group_var')['values_var'].describe()**

The following example shows how to use this syntax in practice.

**Example: Use describe() by Group in Pandas**

Suppose we have the following pandas DataFrame that contains information about basketball players on two different teams:

**import pandas as pd
#create DataFrame
df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 12, 14, 14, 15, 22, 27, 24],
                   'assists': [2, 2, 3, 5, 7, 6, 8, 12]})
#view DataFrame
print(df)
  team  points  assists
0    A       8        2
1    A      12        2
2    A      14        3
3    A      14        5
4    B      15        7
5    B      22        6
6    B      27        8
7    B      24       12**

We can use the **describe()** function along with the **groupby()** function to summarize the values in the **points** column for each **team**:

**#summarize points by team
df.groupby('team')['points'].describe()
      count  mean       std   min    25%   50%    75%   max
team
A       4.0  12.0  2.828427   8.0  11.00  13.0  14.00  14.0
B       4.0  22.0  5.099020  15.0  20.25  23.0  24.75  27.0**
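The grouped summary is itself a DataFrame whose index holds the team labels, so individual statistics can be looked up by label. The snippet below is a minimal sketch of that idea; the variable names `summary`, `mean_a`, and `median_b` are illustrative and not part of the original example.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 12, 14, 14, 15, 22, 27, 24],
                   'assists': [2, 2, 3, 5, 7, 6, 8, 12]})

# one row per team, one column per statistic
summary = df.groupby('team')['points'].describe()

# the team labels form the index, so .loc takes (team, statistic)
mean_a = summary.loc['A', 'mean']      # mean points for team A
median_b = summary.loc['B', '50%']     # median points for team B

print(mean_a, median_b)
```

Note that the percentile columns are named with strings such as `'50%'`, so they must be quoted when selected.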

From the output, we can see the following values for the **points** variable for each team:

- **count** (number of observations)
- **mean** (mean points value)
- **std** (standard deviation of points values)
- **min** (minimum points value)
- **25%** (25th percentile of points)
- **50%** (50th percentile, i.e. the median, of points)
- **75%** (75th percentile of points)
- **max** (maximum points value)
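To make the meaning of each statistic concrete, the same table can be rebuilt by calling the individual groupby methods one at a time. This is only an illustrative sketch, not how pandas computes the summary internally; the names `grouped` and `manual` are assumptions introduced here.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 12, 14, 14, 15, 22, 27, 24],
                   'assists': [2, 2, 3, 5, 7, 6, 8, 12]})

grouped = df.groupby('team')['points']

# build each column of the summary from its own groupby method
manual = pd.DataFrame({
    'count': grouped.count(),         # number of observations
    'mean': grouped.mean(),           # arithmetic mean
    'std': grouped.std(),             # sample standard deviation (ddof=1)
    'min': grouped.min(),             # smallest value
    '25%': grouped.quantile(0.25),    # 25th percentile
    '50%': grouped.median(),          # 50th percentile (median)
    '75%': grouped.quantile(0.75),    # 75th percentile
    'max': grouped.max(),             # largest value
})

print(manual)
```

The one visible difference is that `count` comes back as an integer here, while **describe()** reports it as a float.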

The grouped summary is already a DataFrame, but the team labels are stored in its index. If you'd like **team** to appear as a regular column instead, you can chain the **reset_index()** method:

**#summarize points by team
df.groupby('team')['points'].describe().reset_index()
  team  count  mean       std   min    25%   50%    75%   max
0    A    4.0  12.0  2.828427   8.0  11.00  13.0  14.00  14.0
1    B    4.0  22.0  5.099020  15.0  20.25  23.0  24.75  27.0**

The variable **team** is now a column in the DataFrame and the index values are 0 and 1.
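One practical consequence is that **team** can then be filtered and selected like any other column. The following is a small sketch of that usage; the threshold of 15 and the name `high_scoring` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'team': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
                   'points': [8, 12, 14, 14, 15, 22, 27, 24],
                   'assists': [2, 2, 3, 5, 7, 6, 8, 12]})

# move the team labels out of the index and into a regular column
summary = df.groupby('team')['points'].describe().reset_index()

# keep only teams whose mean points exceed an arbitrary threshold,
# then select 'team' alongside the statistics of interest
high_scoring = summary[summary['mean'] > 15][['team', 'mean', 'max']]

print(high_scoring)
```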

**Additional Resources**

The following tutorials explain how to perform other common operations in pandas:

Pandas: How to Calculate Cumulative Sum by Group

Pandas: How to Count Unique Values by Group

Pandas: How to Calculate Correlation By Group