% Generated by roxygen2 (4.0.2): do not edit by hand
\name{stabit}
\alias{stabit}
\alias{stabitCpp}
\title{Structural Matching Model to correct for sample selection bias}
\usage{
stabit(x, m.id = "m.id", g.id = "g.id", R = "R", selection = NULL,
  outcome = NULL, roommates = FALSE, simulation = "none", seed = 123,
  max.combs = Inf, method = "NTU", binary = FALSE, offsetOut = 0,
  offsetSel = 0, marketFE = FALSE, censored = 0, gPrior = FALSE,
  dropOnes = FALSE, interOut = 0, interSel = 0, niter = 10)
}
\arguments{
\item{x}{data frame with individual-level characteristics of all group members including
market- and group-identifiers.}

\item{m.id}{character string giving the name of the market identifier variable. Defaults to \code{"m.id"}.}

\item{g.id}{character string giving the name of the group identifier variable. Defaults to \code{"g.id"}.}

\item{R}{dependent variable in outcome equation. Defaults to \code{"R"}.}

\item{selection}{list of the variables in the selection equation and the operators applied to them. The format is
\code{operation = "variable"}. See the Details and Examples sections.}

\item{outcome}{list of the variables in the outcome equation and the operators applied to them. The format is
\code{operation = "variable"}. See the Details and Examples sections.}

\item{roommates}{logical: if \code{TRUE} data is assumed to come from a roommate game. This means that groups
are of size two and the model matrix is prepared for individual-level analysis (peer-effects estimation).
If \code{FALSE} (which is the default) data is assumed to come from a group/coalition formation game and
the model matrix is prepared for group-level analysis.}

\item{simulation}{should the values of dependent variables in selection and outcome equations be simulated?
Options are \code{"none"} for no simulation, \code{"NTU"} for non-transferable utility matching, \code{"TU"}
for transferable utility or \code{"random"} for random matching of individuals to groups.
Simulation settings are (i) all model coefficients set to \code{alpha=beta=1}; (ii) covariance between
error terms \code{delta=0.5}; (iii) error terms \code{eta} and \code{xi} are draws from a standard normal distribution.}

\item{seed}{integer setting the state for random number generation if \code{simulation} is not \code{"none"}.}

\item{max.combs}{integer (divisible by two) giving the maximum number of feasible groups to be used for
generating group-level characteristics.}

\item{method}{estimation method to be used. Either \code{"NTU"} or \code{"TU"} for selection correction using
non-transferable or transferable utility matching as selection rule; \code{"outcome"} for estimation of the
outcome equation only; or \code{"model.frame"} for no estimation.}

\item{binary}{logical: if \code{TRUE} outcome variable is taken to be binary; if \code{FALSE} outcome variable is taken to be continuous.}

\item{offsetOut}{vector of integers indicating the indices of columns in \code{X} for which coefficients should be forced to 1. Use 0 for none.}

\item{offsetSel}{vector of integers indicating the indices of columns in \code{W} for which coefficients should be forced to 1. Use 0 for none.}

\item{marketFE}{logical: if \code{TRUE} market-level fixed effects are used in outcome equation; if \code{FALSE} no market fixed effects are used.}

\item{censored}{integer indicating whether draws of the \code{delta} parameter, which captures the covariation between the error terms
in the selection and outcome equations, are censored: \code{0} for not censored, \code{1} for censored from below, \code{2} for censored from above.}

\item{gPrior}{logical: if \code{TRUE} the g-prior (Zellner, 1986) is used for the variance-covariance matrix.}

\item{dropOnes}{logical: if \code{TRUE} one-group-markets are excluded from estimation.}

\item{interOut}{two-column matrix indicating the indices of columns in \code{X} that should be interacted in estimation. Use 0 for none.}

\item{interSel}{two-column matrix indicating the indices of columns in \code{W} that should be interacted in estimation. Use 0 for none.}

\item{niter}{number of iterations to use for the Gibbs sampler.}
}
\value{
\code{stabit} returns a list with the following items.
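\item{model.list}{see the section \sQuote{Values of model.list}.}
\item{model.frame}{see the section \sQuote{Values of model.frame}.}
\item{draws}{see the section \sQuote{Values of draws}.}
\item{coefs}{see the section \sQuote{Values of coefs}.}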
}
\description{
The function provides a Gibbs sampler for a structural matching model that corrects
for sample selection bias when the selection process is a one-sided matching game; that is,
group/coalition formation.

The input is individual-level data of all group members from one-sided matching markets; that is,
from group/coalition formation games.

In a first step, the function generates a model matrix with characteristics of \emph{all feasible} groups
of the same size as the observed groups in the market.

For example, in the stable roommates problem with \eqn{n=4} students
\eqn{\{1,2,3,4\}}{{1,2,3,4}} sorting into groups of 2,
we have \eqn{{4 \choose 2}=6}{"4 choose 2" = 6} feasible groups, which pair up into three partitions of the market:
(1,2)(3,4), (1,3)(2,4) and (1,4)(2,3).

In the group formation problem with \eqn{n=6} students
\eqn{\{1,2,3,4,5,6\}}{{1,2,3,4,5,6}} sorting into groups of 3,
we have \eqn{{6 \choose 3}=20}{"6 choose 3" = 20} feasible groups.
For the same students sorting into groups of sizes 2 and 4,
we have \eqn{{6 \choose 2} + {6 \choose 4}=30}{"6 choose 2" + "6 choose 4" = 30} feasible groups.
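
The counts above can be checked with a short base-R sketch. It is not part of the package, and the
object names are illustrative only.
\preformatted{
# Roommates example: 4 students sorting into groups of 2
n_pairs <- choose(4, 2)     # number of feasible groups
pairs   <- combn(4, 2)      # one feasible group per column

# Group formation example: 6 students
n_triples <- choose(6, 3)                  # groups of size 3
n_mixed   <- choose(6, 2) + choose(6, 4)   # groups of sizes 2 and 4
}
Each column of \code{pairs} is one feasible group; its complement in the market is the group it is
partitioned with, for example (1,2) with (3,4).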

The structural model consists of a selection and an outcome equation. The \emph{Selection Equation}
determines which matches are observed (\eqn{D=1}) and which are not (\eqn{D=0}).
\deqn{ \begin{array}{lcl}
       D &= & 1[V \in \Gamma] \\
       V &= & W\alpha + \eta
       \end{array}
     }{ D = 1[V in \Gamma] with V = W\alpha + \eta
     }
Here, \eqn{V} is a vector of latent valuations of \emph{all feasible} matches, i.e., observed and
unobserved, and \eqn{1[.]} is the Iverson bracket.
A match is observed if its match valuation is in the set of valuations \eqn{\Gamma}
that satisfy the equilibrium condition (see Klein, 2014). This condition differs for matching
games with transferable and non-transferable utility and can be specified using the \code{method}
argument.
The match valuation \eqn{V} is a linear function of \eqn{W}, a matrix of characteristics for
\emph{all feasible} groups, and \eqn{\eta}, a vector of random errors. \eqn{\alpha} is a parameter
vector to be estimated.

The \emph{Outcome Equation} determines the outcome for \emph{observed} matches. The dependent
variable can be either continuous or binary, depending on the value of the \code{binary}
argument. In the binary case, the dependent variable \eqn{R} is determined by a threshold
rule for the latent variable \eqn{Y}.
\deqn{ \begin{array}{lcl}
       R &= & 1[Y > c] \\
       Y &= & X\beta + \epsilon
       \end{array}
     }{ R = 1[Y > c] with Y = X\beta + \epsilon
     }
Here, \eqn{Y} is a linear function of \eqn{X}, a matrix of characteristics for \emph{observed}
matches, and \eqn{\epsilon}, a vector of random errors. \eqn{\beta} is a parameter vector to
be estimated.

The structural model imposes a linear relationship between the error terms of both equations
as \eqn{\epsilon = \delta\eta + \xi}, where \eqn{\xi} is a vector of random errors and \eqn{\delta}
is the covariance parameter to be estimated. If \eqn{\delta} were zero, the error terms
\eqn{\epsilon} and \eqn{\eta} would be independent and the selection problem would vanish.
That is, the observed outcomes would be a random sample from the population of interest.
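
The role of \eqn{\delta} can be illustrated with a small base-R simulation. This is a sketch only: it
assumes standard normal draws for \code{eta} and \code{xi} and \code{delta = 0.5}, as in the simulation
settings described for the \code{simulation} argument, and the sample size, seed and object names are
arbitrary choices rather than package defaults.
\preformatted{
set.seed(123)                  # arbitrary seed for reproducibility
n     <- 1e5                   # illustrative number of draws
delta <- 0.5                   # covariance parameter
eta   <- rnorm(n)              # errors in the selection equation
xi    <- rnorm(n)              # component independent of eta

epsilon <- delta * eta + xi    # errors in the outcome equation
cov_hat <- cov(epsilon, eta)   # close to delta, because Var(eta) = 1

epsilon0 <- 0 * eta + xi       # same construction with delta = 0
cov0_hat <- cov(epsilon0, eta) # close to zero: no selection problem
}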
}
\details{
Operators for variable transformations in \code{selection} and \code{outcome} arguments.
\describe{
\item{\code{add}}{sum over all group members and divide by group size.}
\item{\code{int}}{sum over all possible two-way interactions \eqn{x*y} of group members
and divide by the number of those, given by \code{choose(n,2)}.}
\item{\code{ieq}}{sum over all possible two-way equality assertions \eqn{1[x=y]} and
divide by the number of those.}
\item{\code{ive}}{sum over all possible two-way interactions of vectors
of variables of group members and divide by number of those.}
\item{\code{inv}}{...}
\item{\code{dst}}{sum over all possible two-way distances between players and divide by
number of those, where distance is defined as \eqn{e^{-|x-y|}}{exp(-|x-y|)}.}
\item{\code{sel}}{for \code{roommates=TRUE} only: variable for individual (for peer effects estimation).}
\item{\code{oth}}{for \code{roommates=TRUE} only: variable for other in the group (for peer effects estimation).}
}
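
For a single group, the operator descriptions above can be read as the following base-R computations.
This is an illustrative sketch, not the package's internal code: the vectors \code{x} and \code{w} and
the \code{op_} names are hypothetical, and the \code{ive}, \code{inv}, \code{sel} and \code{oth}
operators are not covered.
\preformatted{
x <- c(1, 2, 4)              # hypothetical numeric characteristic
w <- c("a", "a", "b")        # hypothetical categorical characteristic

idx <- combn(length(x), 2)   # all pairs of group members, one per column
i <- idx[1, ]
j <- idx[2, ]

op_add <- sum(x) / length(x)                        # add
op_int <- sum(x[i] * x[j]) / choose(length(x), 2)   # int
op_ieq <- sum(w[i] == w[j]) / ncol(idx)             # ieq
op_dst <- sum(exp(-abs(x[i] - x[j]))) / ncol(idx)   # dst
}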
}
\section{Values of model.list}{

\describe{
\item{\code{D}}{vector that indicates -- for all feasible groups in the market -- whether a group is observed in the data (\code{D=1}) or not (\code{D=0}).}
\item{\code{R}}{list of group-level outcome vectors for equilibrium groups.}
\item{\code{W}}{list with data.frame \code{W[[t]][G,]} containing characteristics of group \code{G} in market \code{t} (all feasible groups).}
\item{\code{X}}{list with data.frame \code{X[[t]][G,]} containing characteristics of group \code{G} in market \code{t} (equilibrium groups only).}
\item{\code{V}}{vector of group valuations for all feasible groups in the market.}
\item{\code{P}}{vector that gives for each group the index of the group comprised of residual individuals in the market (for 2-group markets).}
\item{\code{epsilon}}{if \code{simulation!="none"}, the errors in the outcome equation, given by \code{delta*eta + xi}.}
\item{\code{eta}}{if \code{simulation!="none"}, the standard normally distributed errors in the selection equation.}
\item{\code{xi}}{if \code{simulation!="none"}, the standard normally distributed component of the errors in the outcome equation that is independent of \code{eta}.}
\item{\code{combs}}{partitions matrix that gives all feasible partitions of the market into groups of the observed sizes.}
\item{\code{E}}{matrix that gives the indices of equilibrium group members for each group in the market. Only differs from the first two rows in \code{combs} if \code{simulation!="none"}.}

\item{\code{sigmasquareximean}}{variance estimate of the error term \code{xi} in the outcome equation.}
}
}

\section{Values of model.frame}{

\describe{
\item{\code{SEL}}{data frame comprising variables in selection equation and number of observations equal to the number of feasible groups.}
\item{\code{OUT}}{data frame comprising variables in outcome equation and number of observations equal to the number of equilibrium groups.}
}
}

\section{Values of draws}{

\describe{
\item{\code{alphadraws}}{matrix of dimension \code{ncol(W)} \code{x} \code{niter} comprising all parameter draws for the selection equation.}
\item{\code{betadraws}}{matrix of dimension \code{ncol(X)} \code{x} \code{niter} comprising all parameter draws for the outcome equation.}
\item{\code{deltadraws}}{vector of length \code{niter} comprising all draws for the \code{delta} parameter.}
\item{\code{sigmasquarexidraws}}{draws for the variance of the error term \code{xi} in the outcome equation.}
}
}

\section{Values of coefs}{

\describe{
\item{\code{eta}}{vector containing the mean of all \code{eta} draws for each observed group.}
\item{\code{alpha}}{matrix comprising the coefficient estimates of alpha and their standard errors.}
\item{\code{beta}}{matrix comprising the coefficient estimates of beta and their standard errors.}
\item{\code{delta}}{coefficient estimate of delta and its standard error.}
\item{\code{sigmasquarexi}}{variance estimate of the error term \code{xi} in the outcome equation and its standard error.}
}
}
\examples{
#########################################
## MODEL FRAMES (method="model.frame") ##
#########################################

## --- ROOMMATES GAME ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 3 groups
##    per market (gpm=3) and 2 individuals per group (ind=2)
 idata <- stabsim(m=3, ind=2, gpm=3)
## 2. Obtain the model frame
 s1 <- stabit(x=idata, selection = list(add="pi", ieq="wst"),
     outcome = list(add="pi", ieq="wst"),
     method="model.frame", simulation="TU", roommates=TRUE)

## --- GROUP/COALITION FORMATION (I) ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 2 groups
## per market (gpm=2) and 2 to 4 individuals per group (ind=2:4)
 idata <- stabsim(m=3, ind=2:4, gpm=2)
## 2. Obtain the model frame
 s2 <- stabit(x=idata, selection = list(add="pi", ieq="wst"),
      outcome = list(add="pi", ieq="wst"),
      method="model.frame", simulation="NTU", roommates=FALSE)

## --- GROUP/COALITION FORMATION (II) ---
\dontrun{
## 1. Load baac00 data from the Townsend Thai project
 data(baac00)
## 2. Obtain the model frame
 s3 <- stabit(x=baac00, selection = list(add="pi", int="pi", ieq="wst", ive="occ"),
      outcome = list(add="pi", int="pi", ieq="wst", ive="occ",
      add=c("loan_size","loan_size2","lngroup_agei")),
      method="model.frame", simulation="none")
}

###############################
## ESTIMATION (method="NTU") ##
###############################

\dontrun{
## --- SIMULATED EXAMPLE ---
## 1. Simulate one-sided matching data for 3 markets (m=3) with 2 groups
##    per market (gpm=2) and 2 to 4 individuals per group (ind=2:4)
 idata <- stabsim(m=3, ind=2:4, gpm=2)
## 2. Run Gibbs sampler
 fit1 <- stabit(x=idata, selection = list(add="pi",ieq="wst"),
        outcome = list(add="pi",ieq="wst"),
        method="NTU", simulation="NTU", binary=FALSE, niter=2000)
## 3. Get results
 names(fit1)

## --- REPLICATION, Klein (2014), Table 8 ---
## 1. Load data
 data(baac00)
## 2. Run Gibbs sampler
 fit2 <- stabit(x=baac00, selection = list(add="pi",int="pi",ive="occ",ieq="wst"),
        outcome = list(add="pi",int="pi",ive="occ",ieq="wst",
        add=c("loan_size","loan_size2","lngroup_agei")),
        method="NTU", binary=TRUE, gPrior=TRUE, marketFE=TRUE, niter=2000)
## 3. Get results
 names(fit2)
}
}
\author{
Thilo Klein
}
\references{
Klein, T. (2014). Stable matching in microcredit: Implications for market design & econometric analysis, PhD thesis,
\emph{University of Cambridge}.

Zellner, A. (1986). \emph{On assessing prior distributions and Bayesian regression analysis with g-prior distributions},
volume 6, pages 233--243. North-Holland, Amsterdam.
}
\keyword{regression}

