% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/TreeModelsAllSteps.R
\name{TreeModelsAllSteps}
\alias{TreeModelsAllSteps}
\title{Data Partition and Tree-based Model Training}
\usage{
TreeModelsAllSteps(
  data = NULL,
  proportion = 0.7,
  seed = 2022,
  methodlist = c("dt", "rf", "gbm"),
  iternumber = 10,
  dt.gridsearch = NULL,
  rf.gridsearch = NULL,
  gbm.gridsearch = NULL,
  checkprogress = FALSE
)
}
\arguments{
\item{data}{A \code{data.frame} containing the features and the outcome variable.
The outcome variable must be named \code{"perf"}.}

\item{proportion}{A numeric value giving the proportion of the data allocated to the training set. The default is 0.7.}

\item{seed}{A numeric value passed to \code{set.seed} for reproducibility. The default is 2022.}

\item{methodlist}{A character vector of the tree-based methods to fit: \code{"dt"} (decision tree), \code{"rf"} (random forest), and \code{"gbm"} (gradient boosting). The default is \code{c("dt", "rf", "gbm")}.}

\item{iternumber}{A numeric value giving the number of folds (resampling iterations) in the cross-validation scheme. The default is 10.}

\item{dt.gridsearch}{A \code{data.frame} giving the tuning grid for the decision tree model.}

\item{rf.gridsearch}{A \code{data.frame} giving the tuning grid for the random forest model.}

\item{gbm.gridsearch}{A \code{data.frame} giving the tuning grid for the gradient boosting model.}

\item{checkprogress}{Logical. If \code{TRUE}, the modeling progress is printed. The default is \code{FALSE}.}
}
\value{
A list with three elements:

\item{DataPartition}{The partitioned datasets: training (\code{cv_train}) and testing (\code{cv_test}).}

\item{ModelObject}{An object with the results from the selected models.}

\item{SummaryReport}{A \code{data.frame} summarizing the model parameters. The summary report is printed automatically in the output.}
}
\description{
Partitions the data into training and testing sets and trains the selected tree-based models under a cross-validation scheme.
}
\details{
This function performs all the steps of a predictive analysis. First, the data are partitioned into training and testing datasets using selection stratified by the outcome variable, as performed by the \code{createDataPartition} function from the \pkg{caret} package. Then, the selected classifiers are fitted to the training dataset under a cross-validation scheme. Users can choose which models to compare through the \code{methodlist} argument. The \pkg{caretEnsemble} package is used in the modeling process to ensure that all models follow the same resampling procedure. For each tree-based method, the model with the largest ROC value (area under the ROC curve) is selected as optimal. Finally, a summary report is displayed.
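
As a rough illustration of the partition step only, the sketch below applies \code{caret::createDataPartition} to a simulated data set. The object names (\code{toy}, \code{idx}, \code{toy_train}, \code{toy_test}) are hypothetical, and the code is an assumption about how such a stratified split can be done, not the internal implementation of this function.

\preformatted{
# Simulated data with a two-class outcome named "perf" (illustrative only)
set.seed(2022)
toy <- data.frame(
  x1 = rnorm(100),
  x2 = rnorm(100),
  perf = factor(rep(c("yes", "no"), each = 50))
)

# Stratified split: rows are sampled within each level of the outcome
idx <- caret::createDataPartition(toy$perf, p = 0.7, list = FALSE)
toy_train <- toy[idx, ]
toy_test <- toy[-idx, ]

# Compare the class balance of the two subsets
table(toy_train$perf)
table(toy_test$perf)
}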
}
\examples{
\donttest{
# Drop the 14th column, then rename the column now in position 14 to "perf"
cp025q01.wgt <- cp025q01.wgt[, -14]
colnames(cp025q01.wgt)[14] <- "perf"

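# Fit all three default methods and print the modeling progress
ensemblist <- TreeModelsAllSteps(data = cp025q01.wgt,
                                 checkprogress = TRUE)

# Fit only the decision tree and gradient boosting models
ensemblist <- TreeModelsAllSteps(data = cp025q01.wgt,
                                 methodlist = c("dt", "gbm"),
                                 checkprogress = TRUE)

# Fit only the random forest model with a user-specified tuning grid
ensemblist <- TreeModelsAllSteps(data = cp025q01.wgt,
                                 methodlist = c("rf"),
                                 rf.gridsearch = data.frame(mtry = 2,
                                                            splitrule = "gini",
                                                            min.node.size = 1),
                                 checkprogress = TRUE)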
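
# A minimal sketch of building a tuning grid with more than one candidate,
# using the same column names as the random forest grid above. The object
# name rf_grid and the candidate values are illustrative assumptions.
rf_grid <- expand.grid(mtry = c(2, 4),
                       splitrule = "gini",
                       min.node.size = c(1, 5),
                       stringsAsFactors = FALSE)

# Each row is one parameter combination (2 x 1 x 2 = 4 rows)
rf_grid

# The grid could then be supplied as rf.gridsearch = rf_grid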
}
}
