You can use the following basic syntax to group rows by month in a pandas DataFrame:
df.groupby(df.your_date_column.dt.month)['values_column'].sum()
This particular formula groups the rows by the month of the dates in your_date_column and calculates the sum of values_column for each month.
Note that dt.month is an attribute (written without parentheses) that extracts the month number, from 1 to 12, from a datetime column in pandas.
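The dt accessor only works on columns with a datetime dtype. As a minimal sketch, assuming a hypothetical DataFrame whose dates are stored as plain strings, you could convert the column with pd.to_datetime() before extracting the month:

```python
import pandas as pd

# Hypothetical data: dates stored as strings rather than datetimes
df = pd.DataFrame({'date': ['2020-01-05', '2020-02-02', '2020-03-01'],
                   'sales': [6, 13, 22]})

# Convert to a datetime dtype so the .dt accessor is available
df['date'] = pd.to_datetime(df['date'])

# dt.month is an attribute, so it is written without parentheses
months = df['date'].dt.month
print(months.tolist())
```

If the column already has a datetime dtype, the conversion step can be skipped.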
The following example shows how to use this syntax in practice.
Example: How to Group by Month in Pandas
Suppose we have the following pandas DataFrame that shows the sales made by some company on various dates:
import pandas as pd
#create DataFrame
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='W', periods=10),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5]})
#view DataFrame
print(df)
        date  sales  returns
0 2020-01-05      6        0
1 2020-01-12      8        3
2 2020-01-19      9        2
3 2020-01-26     11        2
4 2020-02-02     13        1
5 2020-02-09      8        3
6 2020-02-16      8        2
7 2020-02-23     15        4
8 2020-03-01     22        1
9 2020-03-08      9        5
We can use the following syntax to calculate the sum of sales grouped by month:
#calculate sum of sales grouped by month
df.groupby(df.date.dt.month)['sales'].sum()
date
1 34
2 44
3 31
Name: sales, dtype: int64
Here’s how to interpret the output:
- The total sales made during month 1 (January) was 34.
- The total sales made during month 2 (February) was 44.
- The total sales made during month 3 (March) was 31.
We can use similar syntax to calculate the max of the sales values grouped by month:
#calculate max of sales grouped by month
df.groupby(df.date.dt.month)['sales'].max()
date
1 11
2 15
3 22
Name: sales, dtype: int64
We can use similar syntax to calculate any value we’d like grouped by the month value of a date column.
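For instance, several statistics can be computed in a single pass with agg(). The following is a sketch that reuses the example DataFrame from above; the output column names (total_sales, mean_sales, total_returns) are illustrative choices, not part of the original example:

```python
import pandas as pd

# Same example DataFrame as above
df = pd.DataFrame({'date': pd.date_range(start='1/1/2020', freq='W', periods=10),
                   'sales': [6, 8, 9, 11, 13, 8, 8, 15, 22, 9],
                   'returns': [0, 3, 2, 2, 1, 3, 2, 4, 1, 5]})

# Named aggregation: each keyword is an output column,
# each tuple is (input column, aggregation function)
summary = df.groupby(df['date'].dt.month).agg(
    total_sales=('sales', 'sum'),
    mean_sales=('sales', 'mean'),
    total_returns=('returns', 'sum'),
)

print(summary)
```

The result is a DataFrame with one row per month number and one column per named aggregation.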
Note: You can find the complete documentation for the GroupBy operation in the official pandas documentation.
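One caveat: dt.month returns only the month number, so rows from the same month in different years end up in the same group. If the data may span more than one year, one possible approach is to group by dt.to_period('M') instead, which keeps the year and month together. The sketch below uses a small hypothetical DataFrame to show the difference:

```python
import pandas as pd

# Hypothetical data spanning two years
df = pd.DataFrame({'date': pd.to_datetime(['2020-01-05', '2020-01-19',
                                            '2021-01-03', '2021-02-07']),
                   'sales': [6, 9, 4, 7]})

# Grouping by month number merges January 2020 and January 2021
by_month_number = df.groupby(df['date'].dt.month)['sales'].sum()

# Grouping by monthly period keeps each year-month separate
by_year_month = df.groupby(df['date'].dt.to_period('M'))['sales'].sum()

print(by_month_number)
print(by_year_month)
```

Here the first result has two groups (months 1 and 2), while the second has three (2020-01, 2021-01, and 2021-02).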