% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/variable_analysis.R
\name{variable_analysis}
\alias{variable_analysis}
\title{Evaluate the relative importance of each variable}
\usage{
variable_analysis(
  model,
  pts_occ,
  pts_occ_test = NULL,
  variables,
  shap_nsim = 100,
  visualize = FALSE,
  seed = 10
)
}
\arguments{
\item{model}{(\code{isolation_forest}) The extended isolation forest SDM. It can be
the item \code{model} of the \code{POIsotree} object returned by \code{\link{isotree_po}}.}

\item{pts_occ}{(\code{sf}) The \code{sf} table that
includes the training occurrence locations.}

\item{pts_occ_test}{(\code{sf}, or \code{NULL}) The \code{sf}
table that includes the test occurrence locations.
If \code{NULL}, it is set to the same value as \code{pts_occ}. The default is \code{NULL}.}

\item{variables}{(\code{stars}) The \code{stars} of environmental variables. It should have
multiple \code{attributes} instead of \code{dims}. If you have a \code{raster} object instead, you
can use \code{\link{st_as_stars}} to convert it to \code{stars}, or use
\code{\link{read_stars}} to read the source data directly as \code{stars}.}

\item{shap_nsim}{(\code{integer}) The number of Monte Carlo repetitions used by the SHAP
method to estimate each Shapley value. See the documentation of
\code{\link{explain}} in package \code{fastshap} for details. The default is \code{100}.}

\item{visualize}{(\code{logical}) If \code{TRUE}, plot the analysis figures.
The default is \code{FALSE}.}

\item{seed}{(\code{integer}) The seed for any random process. The default is \code{10L}.}
}
\value{
(\code{VariableAnalysis}) A list of
\itemize{
\item{variables (\code{vector} of \code{character}) The names of the environmental variables}
\item{pearson_correlation (\code{tibble}) A table of the Jackknife test based on Pearson correlation}
\item{full_AUC_ratio (\code{tibble}) A table of the AUC ratios of the training and test datasets using all variables,
which serve as references for the Jackknife test}
\item{AUC_ratio (\code{tibble}) A table of the Jackknife test based on AUC ratio}
\item{SHAP (\code{tibble}) A table of the Shapley values of the training and test datasets, reported separately}
}
}
\description{
Evaluate relative importance of each variable within the model
using the following methods:
\itemize{
\item{Jackknife test based on the AUC ratio and on the Pearson correlation with the
result of the model using all variables}
\item{SHapley Additive exPlanations (SHAP) according to Shapley values}}
}
\details{
The \bold{Jackknife test} measures variable importance as the decrease
in model performance when an environmental variable is used alone or is
excluded from the environmental variable pool. This function uses the
Pearson correlation and the AUC ratio as performance measures.

\bold{Pearson correlation} is computed between the predictions generated by
each Jackknife model (a variable used alone or excluded) and the predictions generated
by the full model, and serves as the assessment of model performance.

The area under the ROC curve (AUC) is a threshold-independent evaluator of
model performance that needs both presence and absence data. A ROC curve is
generated by plotting the proportion of correctly predicted presences on the
y-axis against 1 minus the proportion of correctly predicted absences on the x-axis
for all thresholds. Multiple approaches have been used to evaluate the accuracy of
presence-only models. Peterson et al. (2008) modified AUC by plotting the
proportion of presences falling above a range of thresholds against the proportion of
cells of the whole area falling above the same range of thresholds. This is the
so-called \bold{AUC ratio} that is used in this package.

\bold{SHapley Additive exPlanations (SHAP)} uses Shapley values to evaluate variable importance. The
larger the absolute Shapley value, the more important the variable is.
Positive Shapley values indicate a positive effect, while negative Shapley values indicate a
negative effect. Please check the references for more details.
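
For reference, and not as a description of the exact computation performed by
this function, the classical Shapley value of variable \eqn{i} can be written as
below. The notation is introduced here only for illustration: \eqn{N} is the set
of all variables and \eqn{v(S)} is the model output when only the variables in
\eqn{S} are known. Evaluating the sum exactly requires every subset of
variables, which is why each Shapley value is estimated with \code{shap_nsim}
Monte Carlo repetitions instead.

\deqn{\phi_i = \sum_{S \subseteq N \setminus \{i\}} \frac{|S|! (|N| - |S| - 1)!}{|N|!} \left( v(S \cup \{i\}) - v(S) \right)}{phi_i = sum over subsets S of N without i of [ |S|! (|N| - |S| - 1)! / |N|! ] * [ v(S with i) - v(S) ]}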
}
\examples{
\donttest{
# Using a pseudo presence-only occurrence dataset of
# virtual species provided in this package
library(dplyr)
library(sf)
library(stars)
library(itsdm)

data("occ_virtual_species")
obs_df <- occ_virtual_species \%>\% filter(usage == "train")
eval_df <- occ_virtual_species \%>\% filter(usage == "eval")
x_col <- "x"
y_col <- "y"
obs_col <- "observation"

# Format the observations
obs_train_eval <- format_observation(
  obs_df = obs_df, eval_df = eval_df,
  x_col = x_col, y_col = y_col, obs_col = obs_col,
  obs_type = "presence_only")

env_vars <- system.file(
  'extdata/bioclim_tanzania_10min.tif',
  package = 'itsdm') \%>\% read_stars() \%>\%
  slice('band', c(1, 5, 12, 16))

# Fit the model with imperfect_presence mode
mod <- isotree_po(
  obs_mode = "imperfect_presence",
  obs = obs_train_eval$obs,
  obs_ind_eval = obs_train_eval$eval,
  variables = env_vars, ntrees = 10,
  sample_size = 0.8, ndim = 2L,
  seed = 123L, nthreads = 1,
  response = FALSE,
  spatial_response = FALSE,
  check_variable = FALSE)

var_analysis <- variable_analysis(
  model = mod$model,
  pts_occ = mod$observation,
  pts_occ_test = mod$independent_test,
  variables = mod$variables)
plot(var_analysis)
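
# A minimal sketch of inspecting the result, assuming the returned
# VariableAnalysis object is a plain list with the elements named in
# the Value section above.
print(var_analysis)
var_analysis$variables   # names of the environmental variables
var_analysis$AUC_ratio   # Jackknife test based on AUC ratio
var_analysis$SHAP        # Shapley values for training and test data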
}

}
\references{
\itemize{
\item{Peterson,
A. Townsend, Monica Papeş, and Jorge Soberón. "Rethinking receiver operating
characteristic analysis applications in ecological niche modeling."
\emph{Ecological modelling} 213.1 (2008): 63-72. \doi{10.1016/j.ecolmodel.2007.11.008}}
\item{Strumbelj, Erik,
and Igor Kononenko. "Explaining prediction models and individual predictions
with feature contributions." \emph{Knowledge and information systems}
41.3 (2014): 647-665. \doi{10.1007/s10115-013-0679-x}}
\item{\href{http://proceedings.mlr.press/v119/sundararajan20b.html}{Sundararajan,
Mukund, and Amir Najmi. "The many Shapley values for model explanation."
\emph{International Conference on Machine Learning}. PMLR, 2020.}}
\item{\url{https://github.com/bgreenwell/fastshap}}
\item{\url{https://github.com/slundberg/shap}}
}
}
\seealso{
\code{\link{plot.VariableAnalysis}}, \code{\link{print.VariableAnalysis}},
\code{\link{explain}} in \code{fastshap}
}
