Make Dealing with Dates a Little Easier!

Introduction

In this computing club mini session, we will cover the lubridate package and learn how to work with dates and times in R more easily. Lubridate was developed by Garrett Grolemund and Hadley Wickham, and is maintained by Vitalie Spinu. Investigators often provide date/time data in raw form, which makes these variables difficult to work with. Converting them to a usable form is tricky and time-consuming, and the functions in the lubridate package streamline this process. When this session was written, lubridate was not part of the core tidyverse, so it had to be loaded explicitly with library(lubridate); since tidyverse 2.0.0 it is attached along with the other core packages.

Three possible date/time formats

Date: tibbles print this as <date>

Time: tibbles print this as <time>

Date-time: an instant in time (a date plus a time); tibbles print this as <dttm>, and base R calls it POSIXct

Basics

Parsing

## [1] "2019-01-16"
## [1] "2019-01-16"
## [1] "2019-07-01"
## [1] "2019-07-01 17:27:30 EDT"
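The code that produced this output did not survive in these notes. A minimal sketch that prints values of the same shape, assuming the first two lines came from parsing a date and the last two from today() and now():

```r
library(lubridate)

# Parse the same calendar date from two different string layouts
d1 <- ymd("2019-01-16")
d2 <- mdy("01/16/2019")
d1
d2

# Current date and current date-time; both depend on when and where the code runs
today()
now()
```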

Applications

Creation

Let’s use the nycflights13 data

Three scenarios of creating a date/time variable:

String

The parsing function must match the order of the components in the input string; the result is then converted to the standard date format

## [1] "2018-04-08"
## [1] "2018-04-08"
## [1] "2018-04-08"
## [1] "2018-04-08"
## [1] "2017-01-31 20:11:59 UTC"
## [1] "2017-01-31 08:01:00 UTC"
## [1] "2017-01-31 UTC"
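The calls behind this output are not shown. One plausible reconstruction, assuming the input strings below (they are illustrative, not taken from the original session):

```r
library(lubridate)

# The function name gives the order of the components in the input
a <- ymd("2018-04-08")
b <- mdy("April 8th, 2018")
c <- dmy("08-Apr-2018")
d <- ymd(20180408)          # unquoted numbers are accepted too
a; b; c; d

# Append _hms or _hm to the function name to parse a date-time
ymd_hms("2017-01-31 20:11:59")
mdy_hm("01/31/2017 08:01")

# Supplying a time zone turns a date into a date-time
e <- ymd(20170131, tz = "UTC")
e
```

Month names such as "April" and "Apr" are matched according to the current locale, so these strings assume an English locale.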

Individual date-time components
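When the components are spread across several columns, as they are in the flights table of nycflights13, make_date() and make_datetime() combine them. A small sketch, using a hypothetical two-row data frame in place of the real flights data:

```r
library(lubridate)

# Hypothetical stand-in for a few columns of nycflights13::flights
flights_small <- data.frame(
  year   = c(2013, 2013),
  month  = c(1, 1),
  day    = c(1, 1),
  hour   = c(5, 5),
  minute = c(15, 29)
)

# Combine the separate columns into a single date-time column (UTC by default)
flights_small$departure <- with(
  flights_small,
  make_datetime(year, month, day, hour, minute)
)
flights_small$departure
```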

Existing date/time object

Switch between a date-time and a date with as_datetime() and as_date()

## [1] "2019-07-01 UTC"
## [1] "2019-07-01"
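Output like this can be reproduced with fixed inputs. The sketch below assumes the original used today() and now(), and substitutes fixed values so the result does not depend on the run date:

```r
library(lubridate)

# Date -> date-time: the time is set to midnight UTC
dt <- as_datetime(ymd("2019-07-01"))
dt

# Date-time -> date: the time of day is dropped
d <- as_date(ymd_hms("2019-07-01 17:27:30"))
d
```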

Dates and date-times can also be given as numeric offsets from the Unix epoch, 1970-01-01: seconds for as_datetime(), days for as_date()

## [1] "1970-01-01 10:00:00 UTC"
## [1] "1980-01-01"
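Offsets that give these two values can be written out as arithmetic, which makes the units explicit:

```r
library(lubridate)

# 10 hours expressed in seconds since 1970-01-01 00:00:00 UTC
ten_hours <- as_datetime(60 * 60 * 10)
ten_hours

# 10 years plus 2 leap days (1972, 1976) expressed in days since 1970-01-01
ten_years <- as_date(365 * 10 + 2)
ten_years
```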

Date-time

Rounding options

floor_date()

Takes a date-time object and rounds it down to the nearest boundary of the specified time unit

round_date()

Takes a date-time object and rounds it to the nearest value of the specified time unit. Values exactly halfway between two boundaries are rounded up

ceiling_date()

Takes a date-time object and rounds it up to the nearest boundary of the specified time unit
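The three rounding functions can be compared on a single value. The date-time below is an arbitrary example:

```r
library(lubridate)

x <- ymd_hms("2019-07-01 17:27:30")

floor_date(x, "hour")     # 17:00:00, always down
round_date(x, "hour")     # 17:00:00, because 27:30 is closer to 17:00 than to 18:00
ceiling_date(x, "hour")   # 18:00:00, always up

# Exactly halfway rounds up
round_date(ymd_hms("2019-07-01 17:30:00"), "hour")

# Other units work the same way
floor_date(x, "month")
```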

Time spans

durations: measure the exact amount of time between two points

periods: track clock times despite leap years, leap seconds, and daylight saving time

intervals: protean summary of the time information between two points

Durations

## Time difference of 9082 days

Let’s convert vicky_age to a duration using the lubridate package

## [1] "784684800s (~24.87 years)"
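Subtracting two dates in base R gives a difftime, and as.duration() converts it to a lubridate duration. The sketch below uses two fixed dates chosen here only because they are 9082 days apart; the original presumably subtracted a birth date from today():

```r
library(lubridate)

# Illustrative dates (an assumption, picked to give a 9082-day difference)
vicky_age <- ymd("2019-07-01") - ymd("1994-08-19")
vicky_age                  # base R difftime, printed in days

# Same span as a lubridate duration, stored in seconds
vicky_dur <- as.duration(vicky_age)
vicky_dur
```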

Durations can also be built directly with constructor functions, one for each common unit:

## [1] "25s"
## [1] "3000s (~50 minutes)"
## [1] "82800s (~23 hours)"
## [1] "864000s (~1.43 weeks)"
## [1] "20563200s (~34 weeks)"
## [1] "378432000s (~11.99 years)"
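Values like these come from the d-prefixed constructors. The arguments below are inferred from the numbers of seconds in the output:

```r
library(lubridate)

dseconds(25)
dminutes(50)   # 50 * 60 = 3000 seconds
dhours(23)     # 23 * 3600 = 82800 seconds
ddays(10)      # 10 * 86400 = 864000 seconds
dweeks(34)     # 34 * 604800 = 20563200 seconds

# The output above corresponds to a 365-day year (12 * 31536000 seconds).
# Recent lubridate versions define dyears(1) as 365.25 days, so this line
# may print a different number of seconds depending on the installed version.
dyears(12)
```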

Durations always record the time span in seconds

Adding, subtracting, and multiplying

## [1] "1594980780s (~50.54 years)"
## [1] "1576800000s (~49.97 years)"
## [1] "2019-07-02"
## [1] "2018-07-01"
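A sketch of duration arithmetic that matches the shape of this output. The operands are inferred from the second counts (1594980780 - 1576800000 = 18180780 seconds, which is 30 weeks, 10 hours, and 13 minutes), and a fixed start date stands in for today():

```r
library(lubridate)

# Sum of several durations; the seconds contributed by dyears() depend on the
# lubridate version (365 days per year in the output above, 365.25 in recent versions)
dyears(50) + dweeks(30) + dhours(10) + dminutes(13)
50 * dyears(1)

# Adding and subtracting whole-day durations from a date
start <- ymd("2019-07-01")
tomorrow  <- start + ddays(1)
last_year <- start - ddays(365)   # ddays(365) avoids the version-dependent dyears(1)
tomorrow
last_year
```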

Periods

Motivation

## [1] "2017-02-10 14:00:00 EST"
## [1] "2017-03-22 15:00:00 EDT"
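The two values above look like a date-time and the same date-time 40 days later, with the clock one hour ahead. A sketch that reproduces this, assuming the America/New_York time zone and a 40-day duration:

```r
library(lubridate)

start <- ymd_hms("2017-02-10 14:00:00", tz = "America/New_York")
start

# 40 * 86400 seconds later. Daylight saving time began on 2017-03-12,
# so the wall clock reads 15:00 rather than 14:00.
later <- start + ddays(40)
later
```

A duration counts exact seconds, so it does not preserve the time of day across a clock change. This is the motivation for periods.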

Periods are time spans without a fixed length in seconds; they work in human units such as days and months

## [1] "25S"
## [1] "45M 0S"
## [1] "16H 0M 0S"
## [1] "40d 0H 0M 0S"
## [1] "7m 0d 0H 0M 0S"
## [1] "161d 0H 0M 0S"
## [1] "30y 0m 0d 0H 0M 0S"
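Periods have their own constructors, named after the plural unit with no d prefix. The arguments below are inferred from the output:

```r
library(lubridate)

seconds(25)
minutes(45)
hours(16)
days(40)
months(7)    # lubridate adds a numeric method to base R's months()
weeks(23)    # stored as 23 * 7 = 161 days
years(30)
```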

Adding, subtracting, and multiplying

## [1] "200m 0d 0H 0M 0S"
## [1] "30d 4H 10M 0S"
## [1] "2016-12-31"
## [1] "2017-01-01"
## [1] "2017-03-22 15:00:00 EDT"
## [1] "2017-03-22 14:00:00 EDT"
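A sketch of period arithmetic consistent with this output. The multiplier in the first line is a guess (any product giving 200 months prints the same), and ddays(365) is used in place of dyears(1) so the result does not depend on the lubridate version:

```r
library(lubridate)

p1 <- 40 * months(5)
p2 <- days(30) + hours(4) + minutes(10)
p1
p2

# 2016 is a leap year: 365 days of exact time stops one day short of a full year
leap_dur <- ymd("2016-01-01") + ddays(365)
leap_per <- ymd("2016-01-01") + years(1)
leap_dur
leap_per

# Across the 2017 daylight saving change, the period keeps the clock time
start <- ymd_hms("2017-02-10 14:00:00", tz = "America/New_York")
start + ddays(40)   # clock reads 15:00
start + days(40)    # clock reads 14:00
```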

Application to the nycflights13 dataset

Selecting between duration, periods, and intervals

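* Pick the simplest structure that solves the problem

* Duration: use when only physical (elapsed) time matters

* Period: use when adding or subtracting human times such as days or months

* Interval: use to find the length of a specific time span in human units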
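Intervals have no example elsewhere in these notes, so here is a minimal sketch. An interval is anchored to a specific start and end, which lets it answer questions a bare period cannot, such as how many days a particular year has. The dates are illustrative:

```r
library(lubridate)

# An interval can be built with interval() or the %--% operator
leap   <- ymd("2016-01-01") %--% ymd("2017-01-01")
normal <- interval(ymd("2017-01-01"), ymd("2018-01-01"))

# Length of each interval in days
leap_days   <- leap / ddays(1)
normal_days <- normal / ddays(1)
leap_days
normal_days

# time_length() gives the same answer with an explicit unit
time_length(leap, "day")
```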

Permitted operations

Further topics

Time zones!