% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/adls_flatten_date_intervals.R
\name{flatten_date_intervals}
\alias{flatten_date_intervals}
\title{Flatten Date Intervals}
\usage{
flatten_date_intervals(
  data,
  id,
  in_date,
  out_date,
  status = NULL,
  overlap_handling = "most_recent",
  lag = 0
)
}
\arguments{
\item{data}{A data frame, data frame extension (e.g. a tibble), or a lazy
data frame (e.g. from dbplyr or dtplyr).}

\item{id}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted
expressions naming the id variables in data.}

\item{in_date}{<\code{\link[=dplyr_data_masking]{data-masking}}> One unquoted
expression naming the start date variable in data.}

\item{out_date}{<\code{\link[=dplyr_data_masking]{data-masking}}> One unquoted
expression naming the end date variable in data.}

\item{status}{<\code{\link[=dplyr_tidy_select]{tidy-select}}> One or more unquoted
expressions naming a status variable in data, such as region or
hospitalization reason.}

\item{overlap_handling}{A character string naming the method for handling
overlaps within an individual's time when \code{status} has been specified.
\itemize{
\item "none": No special handling is applied to overlapping time intervals
within an individual.
\item "first": The \code{status} mentioned first, that is, the one with the
smallest \code{in_date}, dominates.
\item "most_recent" (default): The most recent \code{status}, that is, the one with
the largest \code{in_date}, dominates. When the most recent \code{status} is fully
contained within an older (and different) \code{status}, the \code{out_date}
associated with the most recent \code{in_date} is kept, but the remaining time
from the older \code{status} is removed. See examples below.
}
}

There is currently no method that lets the most recent status dominate and
then potentially return to an older, longer-running status. If this is
needed, please contact ADLS.}

\item{lag}{A numeric, giving the number of days allowed between time
intervals that should be collapsed into one.}
}
\value{
A data frame with the \code{id}, the \code{status} (if specified), and the
simplified \code{in_date} and \code{out_date}. The returned data is sorted by
\code{id} and \code{in_date}.
}
\description{
A tidyverse-compatible function for simplifying time interval data.
}
\details{
This function identifies overlapping time intervals within each individual
and collapses them into distinct and disjoint intervals. When \code{status} is
specified, these intervals are specific to both individual and status.

If \code{lag} is specified, then intervals must be more than \code{lag} time units
apart to be considered distinct.
}
\examples{

### The flatten function works with both dates and numeric values

dat <- data.frame(
   ID    = c(1, 1, 1, 2, 2, 3, 3, 4),
   START = c(1, 2, 5, 3, 6, 2, 3, 6),
   END   = c(3, 3, 7, 4, 9, 3, 5, 8))
dat |> flatten_date_intervals(ID, START, END)

dat <- data.frame(
   ID    = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
   START = as.Date(c("2012-02-15", "2005-12-13", "2006-01-24",
                     "2002-03-14", "1997-02-27",
                     "2008-08-13", "1998-09-23",
                     "2005-01-12", "2007-05-10")),
   END   = as.Date(c("2012-06-03", "2007-02-05", "2006-08-22",
                     "2005-02-26", "1999-04-16",
                     "2008-08-22", "2015-01-29",
                     "2007-05-07", "2008-12-12")))
dat |> flatten_date_intervals(ID, START, END)



### Allow for a 5-day lag between intervals

dat |> flatten_date_intervals(ID, START, END, lag = 5)
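
# A minimal sketch of how lag changes the result, using made-up numeric toy
# data (the name toy and its values are illustrative only). The two intervals
# are 3 time units apart, so, following the description in Details, they
# should stay separate with the default lag = 0 and be collapsed into a
# single interval with lag = 5.
toy <- data.frame(
   ID    = c(1, 1),
   START = c(1, 8),
   END   = c(5, 10))
toy |> flatten_date_intervals(ID, START, END)
toy |> flatten_date_intervals(ID, START, END, lag = 5)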



### Adding status information

dat <- data.frame(
   ID     = c(1, 1, 1, 2, 2, 3, 3, 4, 4),
   START  = as.Date(c("2012-02-15", "2005-12-13", "2006-01-24",
                      "2002-03-14", "1997-02-27",
                      "2008-08-13", "1998-09-23",
                      "2005-01-12", "2007-05-10")),
   END    = as.Date(c("2012-06-03", "2007-02-05", "2006-08-22",
                      "2005-02-26", "1999-04-16",
                      "2008-08-22", "2015-01-29",
                      "2007-05-07", "2008-12-12")),
   REGION = c("H", "H", "N", "S", "S", "M", "N", "S", "S"))

# Note the difference between the different overlap_handling methods
dat |> flatten_date_intervals(ID, START, END, REGION, "none")
dat |> flatten_date_intervals(ID, START, END, REGION, "first")
dat |> flatten_date_intervals(ID, START, END, REGION, "most_recent")
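
# The same kind of call written with named arguments, following the signature
# shown in Usage. This is only an illustrative sketch: it is intended to match
# the positional call above that uses the first method, and may be easier to
# read when several optional arguments are supplied.
dat |> flatten_date_intervals(
   id               = ID,
   in_date          = START,
   out_date         = END,
   status           = REGION,
   overlap_handling = "first")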

}
\author{
ADLS, EMTH & ASO
}
