% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/brokenstick.R
\name{brokenstick}
\alias{brokenstick}
\title{Fit a \code{brokenstick} model to irregular data}
\usage{
brokenstick(
  formula,
  data,
  knots = NULL,
  boundary = NULL,
  k = NULL,
  degree = 1L,
  method = c("kr", "lmer"),
  control = set_control(method = method, ...),
  na.action = na.exclude,
  light = FALSE,
  ...
)
}
\arguments{
\item{formula}{A formula specifying the outcome, the predictor and the group
variable in \code{data}. The generic shape is \code{formula = y ~ x | group}. The
left-hand side is the outcome, the right-hand side the predictor, and the
name of the grouping variable occurs after the \code{|} sign. Formula treatment
is non-standard: 1) \code{y} and \code{x} should be numeric, 2) only one variable
is allowed in each model term (additional variables will be ignored).}

\item{data}{A data frame or matrix containing the outcome (numeric),
predictor (numeric) and group (numeric, factor, character) variable.}

\item{knots}{Optional, but recommended. Numerical vector with the
locations of the internal knots to be placed on the values of the predictor.}

\item{boundary}{Optional, but recommended. Numerical vector of
length 2 with the left and right boundary knot. The \code{boundary}
setting is passed to \code{\link[splines:bs]{splines::bs()}} as the \code{Boundary.knots} argument.
If not specified, the function determines the boundary knots as
\code{range(x)}. When specified, the \code{boundary} range is internally
expanded to include at least \code{range(knots)}.}

\item{k}{Optional, a convenience parameter for the number of
internal knots. If specified, then \code{k} internal knots are placed
at equidense quantiles of the predictor. For example,
specifying \code{k = 1} puts a knot at the 50th percentile (median),
setting \code{k = 3} puts knots at the 25th, 50th and 75th percentiles,
and so on. If the user specifies both the \code{k} and \code{knots} arguments,
then \code{knots} takes precedence.}

\item{degree}{The degree of the spline. The broken stick model
requires linear splines, so the default is \code{degree = 1}.
Setting \code{degree = 0} yields (crisp) dummy coding, and one
column less than for \code{degree = 1}. The \code{brokenstick} package supports
only \code{degree = 0} and \code{degree = 1}.}

\item{method}{Estimation method. Either \code{"kr"} (for the
Kasim-Raudenbush sampler) or \code{"lmer"} (for \code{\link[lme4:lmer]{lme4::lmer()}}).
Version 1.1.1.9000 changed the default to \code{method = "kr"}.}

\item{control}{List of control options returned by \code{\link[=set_control]{set_control()}}, used
to set algorithmic details. When not specified, the function uses the
defaults set by \code{\link[=control_kr]{control_kr()}} for method \code{"kr"}, and
by \code{\link[lme4:lmerControl]{lme4::lmerControl()}} for method \code{"lmer"}. For ease of use, the user
may set individual options for method \code{"kr"} (e.g. \code{niter = 500}) via the \dots
arguments.}

\item{na.action}{A function that indicates what \code{\link[lme4:lmer]{lme4::lmer()}} should do
when the data contain \code{NA}s. Default set to \code{na.exclude}. Only used by
method \code{"lmer"}.}

\item{light}{Should the returned object be lighter? If \code{light = TRUE}
the returned object will contain only the model settings and parameter
estimates and not store the \code{data}, \code{imp} and \code{mod} elements. The light
object can be used to predict broken stick estimates for new data, but
does not disclose the training data and is very small (often <20 Kb).}

\item{\dots}{Forwards arguments to \code{\link[=control_kr]{control_kr()}}.}
}
\value{
An object of class \code{brokenstick}.
}
\description{
The \code{brokenstick()} function fits an irregularly observed series
of measurements onto a user-specified grid of points (knots).
The model codes the grid by a series of linear B-splines.
Each modelled trajectory consists of straight lines that join at
the chosen knots and look like a broken stick. Differences between
observations are expressed by a random effect per knot.
}
\details{
The choice between \code{method = "kr"} and \code{method = "lmer"} depends on the size
of the data and the complexity of the model. In general, setting \code{method = "lmer"}
can require substantial calculation time for more complex models
(say > 8 internal knots) and may not converge. Method \code{"kr"} is less
sensitive to model complexity and small samples, and has the added benefit that the
variance-covariance matrix of the random effects can be constrained through the
\code{cormodel} argument. On the other hand, \code{"lmer"} is the better-researched
method, and is more efficient for simpler models and datasets with many
rows.

The default algorithm since version 2.0 is the Bayesian Kasim-Raudenbush
sampler (\code{method = "kr"}). The variance-covariance matrix of the broken stick
estimates absorbs the relations over time. The \code{"kr"} method allows
enforcing a simple structure on this variance-covariance matrix. Currently,
there are three such correlation models: \code{"none"} (default), \code{"argyle"}
and \code{"cole"}. Specify the \code{seed} argument for reproducibility.
See \code{\link[=control_kr]{control_kr()}} for more details.

The alternative \code{method = "lmer"} fits the broken stick model by
\code{\link[lme4:lmer]{lme4::lmer()}}. With this method, the variance-covariance matrix can only be
unstructured. This estimate may be unstable if the number of children is
small relative to the number of specified knots. The default setting
in \code{\link[lme4:lmerControl]{lme4::lmerControl()}} is \code{check.nobs.vs.nRE = "stop"}. The
\code{\link[=set_control]{set_control()}} function changes this to \code{check.nobs.vs.nRE = "warning"}
by default, since otherwise many broken stick models would not run at all.
The method throws warnings that estimates are not stable. It can be
time-consuming for models with many internal knots. Despite the warnings,
the results often look reasonable.

Diagnostics with \pkg{coda} and \pkg{lme4}: The function returns an object
of class \code{brokenstick}. For \code{method = "kr"} the list component named
\code{"mod"} contains a list of \code{mcmc} objects that can be further analysed with
\code{\link[coda:trellisplots]{coda::acfplot()}}, \code{\link[coda:autocorr]{coda::autocorr()}}, \code{\link[coda:crosscorr]{coda::crosscorr()}}, \code{\link[coda:cumuplot]{coda::cumuplot()}},
\code{\link[coda:densplot]{coda::densplot()}}, \code{\link[coda:effectiveSize]{coda::effectiveSize()}}, \code{\link[coda:geweke.plot]{coda::geweke.plot()}},
\code{\link[coda:raftery.diag]{coda::raftery.diag()}}, \code{\link[coda:traceplot]{coda::traceplot()}} and the usual \code{plot()}
and \code{summary()} functions. For \code{method = "lmer"} the list component named
\code{"mod"} contains an object of class \link[lme4:merMod-class]{lme4::merMod}. These model objects
are omitted in light \code{brokenstick} objects.
}
\note{
Automatic knot specification is data-dependent, and may not reproduce
on other data. Likewise, knots specified via \code{k} are data-dependent and do not transfer
to other data sets. Fixing the model requires specifying both \code{knots} and
\code{boundary}.
}
\examples{
\donttest{
data <- smocc_200[1:1198, ]

# using kr method, default
f1 <- brokenstick(hgt_z ~ age | id, data, knots = 0:3, seed = 123)
plot(f1, data, n_plot = 9)

# study sampling behaviour of the sigma2 parameter with coda
library("coda")
plot(f1$mod$sigma2)
acfplot(f1$mod$sigma2)
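
# a sketch, not part of the original examples: constrain the
# variance-covariance matrix of the random effects with the "argyle"
# correlation model; cormodel and seed are assumed to be forwarded
# to control_kr() through the dots, as described under Details
f1a <- brokenstick(hgt_z ~ age | id, data, knots = 0:3,
                   cormodel = "argyle", seed = 123)
plot(f1a, data, n_plot = 9)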

# using lmer method
f2 <- brokenstick(hgt_z ~ age | id, data, knots = 0:3, method = "lmer")
plot(f2, data, n_plot = 9)

# drill down into merMod object with standard diagnostics in lme4
summary(f2$mod)
plot(f2$mod)

# a model with more knots
knots <- round(c(0, 1, 2, 3, 6, 9, 12, 15, 18, 24, 36) / 12, 4)

# method kr takes about 2 seconds
f3 <- brokenstick(hgt_z ~ age | id, data, knots, seed = 222)
plot(f3, data, n_plot = 9)

# method lmer takes about 40 seconds
f4 <- brokenstick(hgt_z ~ age | id, data, knots, method = "lmer")
plot(f4, data, n_plot = 9)
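
# a sketch, not part of the original examples: let k place three internal
# knots at the quartiles of age, and return a light object that holds only
# the model settings and parameter estimates (no data, imp or mod elements)
f5 <- brokenstick(hgt_z ~ age | id, data, k = 3, seed = 321, light = TRUE)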
}
}
