20. Dates
This script covers some of the more common issues we may face while dealing with dates.
1 Date Details
Look at the strptime() help page (?strptime) for guidance on date and time format codes.
Check the local time zone with Sys.timezone().
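As a brief sketch of how the format codes documented under ?strptime are used, the same calendar day can be parsed from differently written text by naming its layout. The example strings below are made up for illustration and are not taken from the data loaded in this script.

```r
# Illustrative strings (assumed, not from the script's data)
iso_text <- "1998-01-13"
us_text  <- "01/13/98"

# ISO 8601 text (year-month-day) needs no format argument
d1 <- as.Date(iso_text)

# %m = month, %d = day of month, %y = two-digit year
d2 <- as.Date(us_text, format = "%m/%d/%y")

# Both strings describe the same day
d1 == d2

# format() goes the other way: Date to text in a chosen layout
format(d1, "%d/%m/%Y")

# Time zone R is using on this machine (the result depends on the system)
Sys.timezone()
```

Only numeric codes are used here because codes such as %b and %A (month and weekday names) depend on the locale.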
2 Creating Daily Dates
Create date columns out of the mangled date data we have loaded.
# Create good date column
new_dates <- sad_dates %>%
  mutate(new_good = as.Date(good))
# Correct bad date column
new_dates <- new_dates %>%
  mutate(new_bad = as.Date(bad, format = "%m/%d/%y"))
# Correct ugly date column
new_dates <- new_dates %>%
  mutate(new_ugly = seq(as.Date("1998-01-13"), as.Date("1998-01-21"), by = "day"))
3 Creating Hourly Dates
If we want to create date values out of data that have hourly values (or smaller), we must create ‘POSIXct’ values because ‘Date’ values may not have a finer temporal resolution than one day.
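A minimal sketch of that difference, using a made-up time stamp rather than the script's data: as.Date() keeps only the calendar day, whereas as.POSIXct() keeps the time of day as well.

```r
# Illustrative time stamp (assumed, not from the script's data)
stamp <- "1998-01-13 09:30:00"

# as.Date() keeps only the calendar day; the time of day is dropped
as_day <- as.Date(stamp)

# as.POSIXct() keeps the time of day, interpreted in the given time zone
as_time <- as.POSIXct(stamp, tz = "Africa/Mbabane")

format(as_day)                          # "1998-01-13"
format(as_time, "%Y-%m-%d %H:%M:%S")    # "1998-01-13 09:30:00"

# POSIXct arithmetic is in seconds, so 3600 is one hour later
format(as_time + 3600, "%H:%M")         # "10:30"
```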
# Correcting good time stamps with hours
new_dates <- new_dates %>%
  mutate(new_good_hours = as.POSIXct(good_hours, tz = "Africa/Mbabane"))
# Correcting bad time stamps with hours
new_dates <- new_dates %>%
  mutate(new_bad_hours = as.POSIXct(bad_hours, format = "%Y-%m-%d %I:%M:%S %p", tz = "Africa/Mbabane"))
# Correcting ugly time stamps with hours
new_dates <- new_dates %>%
  mutate(new_ugly_hours = seq(as.POSIXct("1998-01-13 09:00:00", tz = "Africa/Mbabane"),
                              as.POSIXct("1998-01-13 17:00:00", tz = "Africa/Mbabane"), by = "hour"))
But shouldn't there be a function that loads dates correctly?
4 Importing Dates in One Step
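Why yes, yes there is. read_csv() from the readr package is the way to go: it guesses column types on import and parses unambiguous dates such as 1998-01-13 directly into date values.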
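The import call used in this script is not shown here, so the following is only a sketch of the idea. It writes a small invented file (the column names good and bad are borrowed from above, the rows are made up) and reads it back with the column types named explicitly, so that the non-ISO column is parsed as well.

```r
library(readr)

# Small illustrative file (assumed contents; not the script's data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("good,bad",
             "1998-01-13,01/13/98",
             "1998-01-14,01/14/98"), tmp)

# Name the column types; a format is only needed for the non-ISO column
dates <- read_csv(tmp, col_types = cols(
  good = col_date(),
  bad  = col_date(format = "%m/%d/%y")
))

# Both columns are now Date values
class(dates$good)
class(dates$bad)
```

Without col_types, read_csv() would still guess the ISO-style column as a date, but a column written as 01/13/98 would normally be left as text, which is why its format is spelled out here.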
But why does it matter that we correct the values to dates? For starters, it affects the way our plots look and work. Let us create some random numbers for plotting and see how these compare against our date values when we create figures.
# Generate random numbers (9 values, mean 2, standard deviation 10)
smart_dates$numbers <- rnorm(9, 2, 10)
# Scatterplot with correct dates
ggplot(smart_dates, aes(x = good, y = numbers)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
# Scatterplot with incorrect dates
ggplot(smart_dates, aes(x = bad, y = numbers)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
# OR
ggplot(smart_dates, aes(x = ugly, y = numbers)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE)
If the dates are formatted correctly, it also allows us to do snazzy things with the data.
R> [1] "1998-02-17"
R> Time difference of 6 days
R> [1] "1998-01-21" "1998-01-20" "1998-01-19" "1998-01-18" "1998-01-17"
R> [6] "1998-01-16" "1998-01-15"
R> [1] "1970-01-01"
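The calls that produced the output above are not shown. As a sketch of operations of the same kind, using illustrative dates chosen here rather than the original code, correctly typed dates support arithmetic, differences, and sequences:

```r
# Illustrative dates (assumed; not the original calls)
start <- as.Date("1998-01-15")
end   <- as.Date("1998-01-21")

# Adding a number shifts a date by that many days
start + 7

# Subtracting two dates gives a time difference in days
end - start

# Sequences can run backwards with a negative step
seq(end, start, by = "-1 day")

# Dates are stored as the number of days since 1970-01-01
as.numeric(as.Date("1970-01-01"))
```

None of these operations would work on dates stored as plain text, which is the practical reason for converting them first.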
Citation
@online{a._j.2021,
author = {Smit, A. J.},
title = {20. {Dates}},
date = {2021-01-01},
url = {http://samos-r.netlify.app/intro_r/20-dates.html},
langid = {en}
}



