% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/describe_data.r
\name{describe_data}
\alias{describe_data}
\title{Calculate common descriptive statistics}
\usage{
describe_data(data, column, na.rm = TRUE, short = FALSE)
}
\arguments{
\item{data}{A data frame.}

\item{column}{An unquoted (numerical) column name from the data frame.}

\item{na.rm}{Logical. Should missing values (including \code{NaN}) be excluded
when calculating the descriptives? The default is \code{TRUE}.}

\item{short}{Logical. Should only a subset of descriptives be reported? If
set to \code{TRUE}, only the number of observations (N), mean (M), and
standard deviation (SD) are returned. The default is \code{FALSE}.}
}
\description{
\code{describe_data} returns a set of common descriptive statistics
(e.g., n, mean, sd) for a numeric column in a data frame.
}
\details{
The data can be grouped with \code{dplyr::group_by()} so that
descriptives are calculated for each group level.

When \code{na.rm} is set to \code{FALSE}, a percentage column is added to the
output that contains the percentage of non-missing data.

Skew and kurtosis are based on the \code{skewness} and \code{kurtosis}
functions of the \code{moments} package (Komsta & Novomestky, 2015).

Percentages are calculated from the total number of non-missing observations.
When \code{na.rm} is set to \code{FALSE}, they are instead based on the total
number of missing and non-missing observations.
}
\examples{
# Load the dplyr package for access to the \%>\% operator and group_by()
library(dplyr)

# Inspect descriptives of the response column from the 'quote_source' data
# frame included in tidystats
describe_data(quote_source, response)

# Repeat the above, now for each level of the source column
quote_source \%>\%
  group_by(source) \%>\%
  describe_data(response)
  
# Only report the N, mean, and standard deviation for each level of source
quote_source \%>\%
  group_by(source) \%>\%
  describe_data(response, short = TRUE)
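
# A minimal sketch of the na.rm argument described under Details: keep
# missing values in the calculations. This reuses the 'quote_source' data
# frame and its response column from the examples above; the resulting
# output is not shown here.
describe_data(quote_source, response, na.rm = FALSE)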

}
