% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/create_pargasite_data.R
\name{create_pargasite_data}
\alias{create_pargasite_data}
\title{Create a data cube for air pollutant levels covering the conterminous United
States}
\usage{
create_pargasite_data(
  pollutant = c("CO", "SO2", "NO2", "Ozone", "PM2.5", "PM10"),
  data_field = c("NAAQS_statistic", "arithmetic_mean"),
  event_filter = c("Events Included", "Events Excluded", "Concurred Events Excluded"),
  year,
  by_month = FALSE,
  cell_size = 10000,
  nmax = Inf,
  aqs_email = get_aqs_email(),
  aqs_key = get_aqs_key(),
  download_chunk_size = c("2-week", "month")
)
}
\arguments{
\item{pollutant}{A string specifying the air pollutant for which to create a
raster data cube. Must be one of CO, SO2, NO2, Ozone, PM2.5, and PM10.}

\item{data_field}{A vector of strings specifying which data fields
are used to summarize the data. Must be 'NAAQS_statistic',
'arithmetic_mean', or both. 'NAAQS_statistic' tries to choose an
appropriate field based on the National Ambient Air Quality Standards (NAAQS)
in the AQS yearly data (e.g., for CO 1-hour average, 'second_max_value'
would be chosen). 'arithmetic_mean' represents the measure of central
tendency in the yearly data. Ignored when \code{by_month = TRUE}.}

\item{event_filter}{A vector of strings indicating whether data measured
during exceptional events are included in the summary. 'Events Included'
means that events occurred and the data from them are included in the
summary. 'Events Excluded' means that events occurred but the data from them
are excluded from the summary. 'Concurred Events Excluded' means that
events occurred but only EPA-concurred exclusions are removed from the
summary. If multiple values are specified, pollutant levels for each
filter are stored in the \code{event} dimension of the resulting output.}

\item{year}{A vector of 4-digit numeric values specifying the years for which
to retrieve pollutant levels.}

\item{by_month}{A logical value indicating whether data are summarized at
the monthly level instead of the yearly level.}

\item{cell_size}{A numeric value specifying the size of grid cells in
meters.}

\item{nmax}{An integer value specifying the number of nearest observations
that should be used for spatial interpolation.}

\item{aqs_email}{A string specifying the registered email for AQS API
service.}

\item{aqs_key}{A string specifying the registered key for AQS API service.}

\item{download_chunk_size}{A string specifying a chunk size for AQS API
daily data download to prevent an unexpected server timeout error. Ignored
when \code{by_month = FALSE}.}
}
\value{
A stars object containing the interpolated pollutant levels over
CONUS.
}
\description{
A function to create a raster-based pollutant concentration input for
pargasite's shiny application. It downloads pollutant data via the
Environmental Protection Agency's (EPA) Air Quality System (AQS) API
service, filters the data by exceptional event (e.g., wildfire) status, and
performs inverse distance weighted (IDW) interpolation to estimate
pollutant concentrations covering the conterminous United States (CONUS) over
user-defined time ranges.
}
\details{
By default, it returns yearly-summarized concentrations using AQS's annual
data but can also provide monthly-summarized concentrations by aggregating
AQS's daily data. Note that the function chooses an appropriate data field
for each pollutant to check the air quality status based on the National
Ambient Air Quality Standards (NAAQS) for yearly-summarized outputs as
follows:
\itemize{
\item CO 1-hour: \code{second_max_value} field
\item CO 8-hour: \code{second_max_nonoverlap} field
\item SO2 1-hour: \code{ninety_ninth_percentile} field
\item NO2 1-hour: \code{ninety_eighth_percentile} field
\item NO2 Annual: \code{arithmetic_mean} field
\item Ozone 8-hour: \code{fourth_max_value} field
\item PM10 24-hour: \code{primary_exceedance_count} field
\item PM2.5 24-hour: \code{ninety_eighth_percentile} field
\item PM2.5 Annual: \code{arithmetic_mean} field
}

For monthly-summarized outputs, it uses the \code{arithmetic_mean} field of daily
data. Please check the AQS API \code{metaData/fieldsByService} (see
\link[raqs:aqs_metadata]{raqs::metadata_fieldsbyservice}) and the
\href{https://aqs.epa.gov/aqsweb/documents/AQS_Data_Dictionary.html}{AQS
data dictionary} for details on the fields.

For spatial interpolation, the AQS data are projected to EPSG:6350 (NAD83
CONUS Albers), and thus the \code{cell_size} value is given in meters (e.g.,
5,000 creates a grid of 5 km x 5 km cells). The smaller the \code{cell_size},
the more processing time is required: halving it roughly quadruples the number
of grid cells.
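
As a rough illustration of the IDW step (a minimal base R sketch, not
pargasite's actual implementation; the helper name \code{idw_point}, the
distance power of 2, and the toy coordinates are assumptions made for this
example), the value at a grid cell can be thought of as a distance-weighted
average of its \code{nmax} nearest observations:
\preformatted{
## Hypothetical helper: IDW estimate at a single location (x0, y0)
idw_point <- function(x0, y0, x, y, z, nmax = Inf, power = 2) {
  d <- sqrt((x - x0)^2 + (y - y0)^2)               # distances to observations
  keep <- order(d)[seq_len(min(nmax, length(d)))]  # indices of nmax nearest
  d <- d[keep]
  z <- z[keep]
  if (any(d == 0)) return(z[d == 0][1])            # exact hit: use that value
  w <- 1 / d^power                                 # closer points weigh more
  sum(w * z) / sum(w)                              # weighted average
}

## Two observations equidistant from the target contribute equally
idw_point(0, 0, x = c(-1000, 1000), y = c(0, 0), z = c(10, 20))
}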
}
\examples{
\dontrun{

## Set your AQS API key first using raqs::set_aqs_user() to run the example.

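## SO2 and CO concentrations from 2020 through 2022
## (event_filter is named because the second positional argument is data_field)
so2 <- create_pargasite_data("SO2", event_filter = "Events Included",
                             year = 2020:2022)
co <- create_pargasite_data("CO", event_filter = "Events Included",
                            year = 2020:2022)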

## Combine them; can combine other pollutant grids in the same way
pargasite_input <- c(so2, co)
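
## The calls below are a sketch built only from the arguments documented
## above; they assume a valid AQS key, and the specific values (year, cell
## size, nmax) are chosen purely for illustration.

## Monthly ozone summaries for 2021, downloading daily data in monthly chunks
ozone <- create_pargasite_data(
  "Ozone", event_filter = "Events Included", year = 2021,
  by_month = TRUE, download_chunk_size = "month"
)

## Yearly PM2.5 on a coarser 20 km grid, keeping both data fields and
## interpolating from the 50 nearest observations
pm25 <- create_pargasite_data(
  "PM2.5", data_field = c("NAAQS_statistic", "arithmetic_mean"),
  event_filter = "Events Excluded", year = 2022,
  cell_size = 20000, nmax = 50
)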
}

}
