Home » Pandas: Create Date Column from Year, Month and Day

Pandas: Create Date Column from Year, Month and Day

by Tutor Aspire

You can use the following basic syntax to create a date column from year, month, and day columns in a pandas DataFrame:

df['date'] = pd.to_datetime(dict(year=df.year, month=df.month, day=df.day))

The following example shows how to use this syntax in practice.

Example: Create Date Column from Year, Month and Day in Pandas

Suppose we have the following pandas DataFrame that shows the sales made by some company on various dates:

import pandas as pd

#create DataFrame
df = pd.DataFrame({'year': [2021, 2022, 2022, 2022, 2022, 2022, 2022, 2022],
                   'month': [7, 1, 1, 2, 5, 10, 11, 12],
                   'day': [4, 15, 25, 27, 27, 24, 10, 18],
                   'sales': [140, 200, 250, 180, 130, 87, 90, 95]})

#view DataFrame
print(df)

   year  month  day  sales
0  2021      7    4    140
1  2022      1   15    200
2  2022      1   25    250
3  2022      2   27    180
4  2022      5   27    130
5  2022     10   24     87
6  2022     11   10     90
7  2022     12   18     95

We can use the following syntax to create a new column called date that combines the values from the year, month, and day columns in the DataFrame to create a date for each row:

#create date column from year, month, and day columns
df['date'] = pd.to_datetime(dict(year=df.year, month=df.month, day=df.day))

#view updated DataFrame
print(df)

   year  month  day  sales       date
0  2021      7    4    140 2021-07-04
1  2022      1   15    200 2022-01-15
2  2022      1   25    250 2022-01-25
3  2022      2   27    180 2022-02-27
4  2022      5   27    130 2022-05-27
5  2022     10   24     87 2022-10-24
6  2022     11   10     90 2022-11-10
7  2022     12   18     95 2022-12-18

Notice that the date column contains date values based on the values from the year, month, and day columns in each row.

If we use df.info() to get information about each column in the DataFrame, we can see that the new date column has a data type of datetime64:

#display information about each column in DataFrame
df.info()


RangeIndex: 8 entries, 0 to 7
Data columns (total 5 columns):
 #   Column  Non-Null Count  Dtype         
---  ------  --------------  -----         
 0   year    8 non-null      int64         
 1   month   8 non-null      int64         
 2   day     8 non-null      int64         
 3   sales   8 non-null      int64         
 4   date    8 non-null      datetime64[ns]
dtypes: datetime64[ns](1), int64(4)
memory usage: 388.0 bytes

Additional Resources

The following tutorials explain how to perform other common operations in pandas:

How to Add and Subtract Days from a Date in Pandas
How to Select Rows Between Two Dates in Pandas
How to Calculate a Difference Between Two Dates in Pandas

You may also like