There are two ways to quickly extract the year from a date in R:
Method 1: Use format()
df$year <- format(as.Date(df$date, format="%d/%m/%Y"), "%Y")
Method 2: Use the lubridate package
library(lubridate)
df$year <- year(mdy(df$date))
This tutorial shows an example of how to use each of these methods in practice.
Method 1: Extract Year from Date Using format()
The following code shows how to extract the year from a date using the format() function combined with the “%Y” argument:
#create data frame
df <- data.frame(date=c("01/01/2021", "01/04/2021", "01/09/2021"),
                 sales=c(34, 36, 44))

#view data frame
df

        date sales
1 01/01/2021    34
2 01/04/2021    36
3 01/09/2021    44

#create new variable that contains year
df$year <- format(as.Date(df$date, format="%d/%m/%Y"), "%Y")

#view new data frame
df

        date sales year
1 01/01/2021    34 2021
2 01/04/2021    36 2021
3 01/09/2021    44 2021
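One detail worth knowing is that format() returns character strings, so the new year column holds text rather than numbers. The sketch below, which reuses the same example data, wraps the call in as.numeric() in case you want to do arithmetic on the year. This is one possible approach rather than a required step.

```r
# Same example data as in the tutorial
df <- data.frame(date = c("01/01/2021", "01/04/2021", "01/09/2021"),
                 sales = c(34, 36, 44))

# format() alone returns character strings
year_chr <- format(as.Date(df$date, format = "%d/%m/%Y"), "%Y")
class(year_chr)   # "character"

# wrap in as.numeric() so the year can be used in calculations
df$year <- as.numeric(year_chr)
class(df$year)    # "numeric"

# arithmetic now works, e.g. years elapsed since 2000
df$year - 2000
```

If the year is only used for grouping or labeling, leaving it as a character column is usually fine.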
Note that this approach works with a variety of date formats. You simply must pass as.Date() a format string that matches how the dates are written:
#create data frame
df <- data.frame(date=c("2021-01-01", "2021-01-04", "2021-01-09"),
                 sales=c(34, 36, 44))

#view data frame
df

        date sales
1 2021-01-01    34
2 2021-01-04    36
3 2021-01-09    44

#create new variable that contains year
df$year <- format(as.Date(df$date, format="%Y-%m-%d"), "%Y")

#view new data frame
df

        date sales year
1 2021-01-01    34 2021
2 2021-01-04    36 2021
3 2021-01-09    44 2021
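To see why the format string matters, the following minimal sketch parses the same dates with a mismatched and a matching format. The variable names (dates, wrong, right) are made up for illustration. When the format does not match, as.Date() returns NA instead of raising an error, so a quick check with is.na() can catch the problem early.

```r
# Dates written as year-month-day
dates <- c("2021-01-01", "2021-01-04", "2021-01-09")

# Mismatched format: parsing fails silently and yields NA values
wrong <- as.Date(dates, format = "%d/%m/%Y")
all(is.na(wrong))     # TRUE

# Matching format: parsing succeeds
right <- as.Date(dates, format = "%Y-%m-%d")
format(right, "%Y")   # "2021" "2021" "2021"
```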
Method 2: Extract Year from Date Using Lubridate
We can also use functions from the lubridate package to quickly extract the year from a date:
library(lubridate)

#create data frame
df <- data.frame(date=c("01/01/2021", "01/04/2021", "01/09/2021"),
                 sales=c(34, 36, 44))

#view data frame
df

        date sales
1 01/01/2021    34
2 01/04/2021    36
3 01/09/2021    44

#create new variable that contains year
df$year <- year(mdy(df$date))

#view new data frame
df

        date sales year
1 01/01/2021    34 2021
2 01/04/2021    36 2021
3 01/09/2021    44 2021
Lubridate also works with a variety of date formats. You simply must use the parsing function that matches the order of the date components, such as ymd(), mdy(), or dmy():
#create data frame
df <- data.frame(date=c("2021-01-01", "2021-01-04", "2021-01-09"),
                 sales=c(34, 36, 44))

#view data frame
df

        date sales
1 2021-01-01    34
2 2021-01-04    36
3 2021-01-09    44

#create new variable that contains year
df$year <- year(ymd(df$date))

#view new data frame
df

        date sales year
1 2021-01-01    34 2021
2 2021-01-04    36 2021
3 2021-01-09    44 2021
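The choice of parsing function decides how an ambiguous date such as 01/04/2021 is read. The short sketch below assumes the lubridate package is installed and uses made-up variable names. It shows that mdy() and dmy() return the same year here but a different month, and that year() returns a numeric value rather than a character string.

```r
library(lubridate)

# An ambiguous date: January 4 or April 1, depending on the convention
x <- "01/04/2021"

# year() returns a number, unlike format(), which returns text
yr <- year(mdy(x))
is.numeric(yr)               # TRUE

# both parsers agree on the year ...
year(mdy(x)) == year(dmy(x)) # TRUE

# ... but not on the month
month_mdy <- month(mdy(x))   # 1 (January)
month_dmy <- month(dmy(x))   # 4 (April)
```

If only the year is needed, either parser gives the same result for dates like these, but picking the one that matches the data keeps any later month or day extraction correct.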
Additional Resources
The following tutorials explain how to perform other common operations in R:
How to Loop Through Column Names in R
How to Remove Outliers from Multiple Columns in R