In this video, we extract the year from a date variable. Sometimes the data hold more detail than we need, such as a variable with the exact date when we only want the year of an event.
A. If the date is stored as a string, like "20250911", we can use R's built-in substr() function to pull out the characters that hold the year.
B. If the date is stored as a date object, using a substring may fail. We can format the date so that only the year is shown.
Creating data for demonstration:
my = data.frame(when = c("20230911", "20240911", "20250911", "20250110", "20230201", "20241225", "20250101"))
my$reverse = c("11092023", "11092024", "11092025", "10012025", "01022023", "25122024", "01012025")
my$dates = as.POSIXct(c("2023-09-11", "2024-09-11", "2025-09-11", "2025-01-10", "2023-02-01", "2024-12-25", "2025-01-01"))
Date stored as a string:
substr(my$reverse, 5, 8)
my$year = as.numeric(substr(my$when,1,4))
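A minimal sketch of how the start and stop positions in substr() work, using a single made-up string rather than the demonstration data frame. Positions are counted from 1 and both ends are included, so an 8-character date in year-month-day order splits like this (the variable names are only illustrative):

```r
# One date string in year-month-day order (illustrative value)
x <- "20250911"

year_part  <- substr(x, 1, 4)  # characters 1 to 4
month_part <- substr(x, 5, 6)  # characters 5 to 6
day_part   <- substr(x, 7, 8)  # characters 7 to 8

year_part   # "2025"
month_part  # "09"
day_part    # "11"
```

The result of substr() is still a string, which is why the line above wraps it in as.numeric().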
Date stored as a date object:
my$year = as.numeric(format(my$dates, "%Y"))
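The two approaches can also be combined. As a sketch that is not shown in the video, a date string can first be parsed into a Date object and then formatted, so the year no longer depends on character positions. The vector below is made up for illustration and assumes a day-month-year layout like the "reverse" column:

```r
# Illustrative day-month-year strings (not the demonstration data frame)
day_first <- c("11092023", "25122024", "01012025")

# Parse the strings into Date objects by describing their layout
parsed <- as.Date(day_first, format = "%d%m%Y")

# Keep only the year and convert it to a number
years <- as.numeric(format(parsed, "%Y"))
years
```

If a string does not match the given layout, as.Date() returns NA for that entry, which makes malformed values easier to spot than with substr().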
00:20 String