% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/snp_pca.R
\name{snp.pca}
\alias{snp.pca}
\title{Performs a Principal Component Analysis (PCA) based on a molecular matrix M}
\usage{
snp.pca(M = NULL, label = FALSE, ncp = 10, groups = NULL, ellipses = FALSE)
}
\arguments{
\item{M}{A matrix with SNP data of full form (\eqn{n \times p}), with \eqn{n}
individuals and \eqn{p} markers (default = \code{NULL}).}

\item{label}{If \code{TRUE}, individual names are included in the output (default = \code{FALSE}).}

\item{ncp}{The number of principal component (PC) dimensions to show in the scree plot and to
include in the output data frames (default = \code{10}).}

\item{groups}{A factor vector used to assign different colors to individuals in the PCA plot.
It must be in the same order as the individuals in the molecular matrix
\eqn{\boldsymbol{M}} (default = \code{NULL}).}

\item{ellipses}{If \code{TRUE}, ellipses will be drawn around each of the levels defined in
\code{groups} (default = \code{FALSE}).}
}
\value{
A list with the following four elements:
\itemize{
\item{\code{eigenvalues}: a data frame with the eigenvalues and their associated variances for each
dimension, including only the first \code{ncp} dimensions.}
\item{\code{pca.scores}: a data frame with scores (rotated observations on the new components) including
only the first \code{ncp} dimensions.}
\item{\code{plot.pca}: a scatterplot of the scores on the first two dimensions (PC1 and PC2).}
\item{\code{plot.scree}: a bar chart with the percentage of variance explained by each of the first \code{ncp} dimensions.}
}
}
\description{
Generates a PCA and summary statistics from a given molecular matrix
to assess population structure. The matrix provided must be of full form
(\eqn{n \times p}), with \eqn{n} individuals and \eqn{p} markers. Individual and
marker names are assigned to \code{rownames} and \code{colnames}, respectively.
SNP data is coded as 0, 1, 2 (integers or decimal numbers). Missing values are
not accepted and must be imputed beforehand (see function \code{qc.filtering()}
for implementing mean imputation). Additional output, such as plots and
data frames, is provided for use in downstream analyses (such as GWAS).
}
\details{
The function calls \code{prcomp()} to generate the PCA and uses the
\code{factoextra} R package to extract and visualize results.
The methodology uses normalized allele frequencies as proposed by Patterson \emph{et al.} (2006).
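
As a point of reference, the normalization described by Patterson \emph{et al.} (2006)
can be written as shown here, where \eqn{\mu_j} is the mean of column \eqn{j} of
\eqn{\boldsymbol{M}} and \eqn{p_j = \mu_j / 2} is the corresponding allele frequency
estimate. The symbol \eqn{M^{*}} is introduced only for illustration; this is a sketch of
the published formula, and the exact centering and scaling applied internally by
\code{snp.pca()} may differ in detail.
\deqn{M^{*}_{ij} = \frac{M_{ij} - \mu_j}{\sqrt{p_j (1 - p_j)}}}{M*[i,j] = (M[i,j] - mu[j]) / sqrt(p[j] * (1 - p[j]))}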
}
\examples{
# Perform the PCA.
SNP_pca <- snp.pca(M = geno.apple, ncp = 10)
ls(SNP_pca)
SNP_pca$eigenvalues
head(SNP_pca$pca.scores)
SNP_pca$plot.pca
SNP_pca$plot.scree

# PCA plot by family (17 groups).
grp <- as.factor(pheno.apple$Family)
SNP_pca_grp <- snp.pca(M = geno.apple, groups = grp, label = FALSE)
SNP_pca_grp$plot.pca
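
# Illustrative sketch only (not the internals of snp.pca): a PCA on a small
# simulated 0/1/2 matrix, normalized as in Patterson et al. (2006) and passed
# to prcomp(). All object names below (M.sim, M.norm, pca.sim, ...) are
# hypothetical and the simulated data are not part of the package.
set.seed(123)
n.ind <- 20
n.snp <- 50
frq <- runif(n.snp, min = 0.2, max = 0.8)
M.sim <- sapply(frq, function(f) rbinom(n.ind, size = 2, prob = f))
dimnames(M.sim) <- list(paste0("ind", 1:n.ind), paste0("snp", 1:n.snp))
# Drop monomorphic markers, which could give a zero scaling factor.
M.sim <- M.sim[, apply(M.sim, 2, var) > 0, drop = FALSE]
mu <- colMeans(M.sim)
p.hat <- mu / 2
# Center each marker by its mean and scale by sqrt(p * (1 - p)).
M.norm <- sweep(M.sim, 2, mu, "-")
M.norm <- sweep(M.norm, 2, sqrt(p.hat * (1 - p.hat)), "/")
pca.sim <- prcomp(M.norm, center = FALSE, scale. = FALSE)
# Percentage of variance explained by each component.
var.pct <- 100 * pca.sim$sdev^2 / sum(pca.sim$sdev^2)
head(round(var.pct, 2))
head(pca.sim$x[, 1:2])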

}
\references{
Patterson, N., Price, A.L., and Reich, D. 2006. Population structure and eigenanalysis.
PLoS Genet 2(12):e190. doi:10.1371/journal.pgen.0020190
}
