% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/rank2map.r
\name{rank2map}
\alias{rank2map}
\title{Convert SNP Ranks To Windows Corresponding to Mapping Distance}
\usage{
rank2map(includedSites, ChosenSites = "all", windowSize = 1e+07, nCores = 1)
}
\arguments{
\item{includedSites}{A character path to a file with columns \code{CHROM} and \code{POS}.}

\item{ChosenSites}{A logical vector indicating which sites are to be included in the
analysis, or \code{"all"} (the default) to include every site.}

\item{windowSize}{A numeric window size for metric conversion in base-pairs.}

\item{nCores}{A numeric value giving the number of cores to be used for
parallelisation. Must be \code{nCores = 1} on Windows.}
}
\value{
A two-column matrix with the number of rows corresponding to the number of
\code{ChosenSites}, indicating start and end indices of adjacent markers that are
within an interval of length \code{windowSize} centered on the specific marker.
}
\description{
This function identifies, for each single nucleotide polymorphism (SNP) in an ordered
set, the ranks of the SNPs that fall within a window spanning a user-defined distance
along the reference to which the SNPs are mapped.
Each window is centered on the mapped position of its SNP.
Converting a SNP rank metric to a mapped position metric is useful for
kernel smoothing of the \code{diem} output state along a genomic sequence.
}
\details{
Single nucleotide polymorphisms (SNPs) tend to be spread across a genome randomly.
To facilitate interpretation of the \code{diem} output, the marker states should be
assessed on the metric of their position along chromosomes (contigs). Because the
spacing between SNPs varies, windows of a fixed length used for kernel smoothing might
contain a variable number of markers. This function estimates
which markers should be assessed together given their proximity on a chromosome.

Values in \code{includedSites} are in essence SNP positions in BED format with a header.
The \code{includedSites} file should ideally be generated by
\link{vcf2diem} to ensure congruence across all analyses.

The function reads SNP positions from the specified BED-like file and divides the
genome into segments based on chromosomes. Each segment is then processed to identify
genomic windows encompassing each SNP, considering the specified window size. When
\code{nCores > 1}, the segments are processed in parallel to enhance performance, and
each SNP is considered within its chromosomal context to ensure accurate window placement.
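
The steps above can be illustrated with a minimal base R sketch. It is not the
implementation used in the package: the data frame \code{sites} and the helper
\code{windowRanks} are hypothetical names, and treating the window boundaries as
inclusive is an assumption.

\preformatted{
# Hypothetical SNP table with the columns expected in includedSites
sites <- data.frame(
  CHROM = c("chr1", "chr1", "chr1", "chr1", "chr2", "chr2"),
  POS   = c(10, 40, 55, 200, 5, 30)
)
windowSize <- 50

# For sorted positions on one chromosome, find the first and last rank
# lying within windowSize / 2 of each SNP (boundaries assumed inclusive)
windowRanks <- function(pos, windowSize) {
  half <- windowSize / 2
  start <- findInterval(pos - half, pos, left.open = TRUE) + 1
  end <- findInterval(pos + half, pos)
  cbind(start = start, end = end)
}

# Process each chromosome separately, then shift the within-chromosome
# ranks by the number of SNPs on preceding chromosomes
byChrom <- split(sites$POS, factor(sites$CHROM, levels = unique(sites$CHROM)))
offsets <- cumsum(c(0, head(lengths(byChrom), -1)))
windows <- do.call(rbind, unname(Map(
  function(pos, off) windowRanks(pos, windowSize) + off,
  byChrom, offsets
)))
windows
}

Each row of \code{windows} holds the start and end rank for one SNP, and no window
extends past the chromosome on which its SNP lies.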

The minimum value of \code{windowSize} is 3, but in genomic data evaluations, the window
size should be at least two orders of magnitude larger. A good approximation of a
useful minimum window size is
\eqn{\mathrm{genome\ size} / (\mathrm{number\ of\ SNPs} / 2)}{(genome size) / ((number of SNPs) / 2)}.
Throughout the diemr package, \code{windowSize} refers to the genomic context of the
respective SNP that the user wishes to consider when smoothing over the polarized
genomic states.
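
As a worked illustration of this rule of thumb, the following snippet uses hypothetical
values for the genome size and the number of SNPs; neither number comes from a real
dataset.

\preformatted{
genomeSize <- 2e9  # hypothetical genome size in base pairs
nSNPs <- 4e5       # hypothetical number of SNPs in includedSites

# Rule of thumb: genome size / (number of SNPs / 2),
# which equals twice the mean spacing between SNPs
minWindowSize <- genomeSize / (nSNPs / 2)
minWindowSize
}

With these values, the suggested minimum is 10000 base pairs, a window that holds about
two SNPs on average.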
}
\note{
The unit of parallelization when using \code{nCores > 1} is set per chromosome.
This may differ from the parallelization approach used in \link{diem}, where processing
of compartment files is parallelized. Note that while compartment files can correspond
to chromosomes, this is not necessarily the case.
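
Splitting the work per chromosome can be sketched with \code{parallel::mclapply}.
Whether \code{rank2map} uses this exact call is an assumption, and the positions
below are hypothetical.

\preformatted{
# Hypothetical SNP positions grouped by chromosome
byChrom <- list(chr1 = c(10, 40, 55, 200), chr2 = c(5, 30))

# mclapply relies on forking, which is unavailable on Windows,
# so mc.cores must stay at 1 there
nCores <- 1

# One task per chromosome; each task here only counts its SNPs
res <- parallel::mclapply(byChrom, length, mc.cores = nCores)
unlist(res)
}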
}
\examples{
 \dontrun{
 # Run this example in a working directory with write permissions
 myo <- system.file("extdata", "myotis.vcf", package = "diemr")
 vcf2diem(myo, "myo")
 rank2map("myo-includedSites.txt", windowSize = 50)
 } 
}
\seealso{
\link{smoothPolarizedGenotypes}
}
\author{
Natalia Martinkova

Filip Jagos \href{mailto:521160@mail.muni.cz}{521160@mail.muni.cz}
}
