% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/lazyframe__lazy.R
\name{LazyFrame_class}
\alias{LazyFrame_class}
\title{Inner workings of the LazyFrame-class}
\value{
not applicable
}
\description{
The \code{LazyFrame}-class is simply two environments holding, respectively,
the public and private methods/function calls to the polars Rust side. The
instantiated \code{LazyFrame}-object is an \code{externalptr} to a low-level Rust polars
LazyFrame object. The pointer address is the only state held by the
LazyFrame object on the R side. Any other state resides on the Rust side. The
S3 method \code{.DollarNames.LazyFrame} exposes all public \verb{$foobar()}-methods that
can be called on the object.

Most methods return another \code{LazyFrame}-class instance or similar, which allows
for method chaining. This class system could, for lack of a better name, be called
"environment classes" and is the same class system extendr provides, except
that here there is both a public and a private set of methods. For implementation
reasons, the private methods are external and must be called as
\code{.pr$LazyFrame$methodname()}. Also, all private methods must take
self as an explicit argument, thus they are pure functions. Having the private methods
as pure functions solved/simplified self-referential complications.

\code{DataFrame} and \code{LazyFrame} can both be said to be a \code{Frame}. To convert, use
\code{DataFrame_object$lazy() -> LazyFrame_object} and \code{LazyFrame_object$collect() -> DataFrame_object}.
This is quite similar to the lazy-collect syntax of the dplyr package for
interacting with database connections such as SQL variants. Most SQL databases
would be able to perform the same optimizations as polars, such as predicate pushdown
and projection pushdown. However, polars can interact with and optimize queries across both
SQL DBs and other data sources such as Parquet files simultaneously. (#TODO
implement r-polars SQL ;).
}
\details{
Check out the source code in R/LazyFrame__lazy.R to see how public methods
are derived from private methods. Check out extendr-wrappers.R to see the
extendr-auto-generated methods. These are moved to \code{.pr} and converted into
pure external functions in after-wrappers.R. In zzz.R (named zzz so that it is the last
file sourced) the extendr-methods are removed and replaced by any function
prefixed \code{LazyFrame_}.
}
\examples{
# see all exported methods
ls(.pr$env$LazyFrame)

# see all private methods (not intended for regular use)
ls(.pr$LazyFrame)
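
# A minimal sketch, not part of the original example: the description says the
# S3 method .DollarNames.LazyFrame exposes the public methods, so calling the
# utils generic on a LazyFrame should list them. This assumes the method is
# registered for the class reported by class() below.
Ldf_sketch = pl$DataFrame(iris)$lazy()
class(Ldf_sketch)
utils::.DollarNames(Ldf_sketch, pattern = "")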


## Practical example ##
# First write the R iris dataset to disk, to illustrate a difference
temp_filepath = tempfile()
write.csv(iris, temp_filepath, row.names = FALSE)

# The following example illustrates two ways to obtain a LazyFrame

# The-Okay-way: convert an in-memory DataFrame to LazyFrame

# eager in-mem R data.frame
Rdf = read.csv(temp_filepath)

# eager in-mem polars DataFrame
Pdf = pl$DataFrame(Rdf)

# lazy frame starting from in-mem DataFrame
Ldf_okay = Pdf$lazy()

# The-Best-Way: a LazyFrame created directly from a data source is best...
Ldf_best = pl$scan_csv(temp_filepath)

# ... because if you e.g. filter the LazyFrame, that filter (also called a predicate) will be
# pushed down the execution stack to the csv reader, thereby only bringing into
# memory the rows matching the filter.
# apply filter:
filter_expr = pl$col("Species") == "setosa" # get only rows where Species is setosa
Ldf_okay = Ldf_okay$filter(filter_expr) # overwrite LazyFrame with new
Ldf_best = Ldf_best$filter(filter_expr)

# the non-optimized plans are similar: apply the filter to the entire in-memory csv
Ldf_okay$describe_plan()
Ldf_best$describe_plan()

# NOTE: for Ldf_okay, the full cost of loading the csv was already paid when creating Rdf and Pdf

# The optimized plans are quite different: Ldf_best will read the csv and perform the filter simultaneously
Ldf_okay$describe_optimized_plan()
Ldf_best$describe_optimized_plan()


# To acquire the result in-mem use $collect()
Pdf_okay = Ldf_okay$collect()
Pdf_best = Ldf_best$collect()


# verify that the tables are the same
all.equal(
  Pdf_okay$to_data_frame(),
  Pdf_best$to_data_frame()
)

# a user might write it as a one-liner like so ($collect() brings the result in-mem):
Pdf_best2 = pl$scan_csv(temp_filepath)$filter(pl$col("Species") == "setosa")$collect()
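
# A minimal round-trip sketch of the conversions named in the description,
# using only calls already shown above: $lazy() wraps an in-memory DataFrame
# and $collect() materializes it again, so both frames should hold the same data.
Pdf_start = pl$DataFrame(iris)
Pdf_roundtrip = Pdf_start$lazy()$collect()
all.equal(
  Pdf_start$to_data_frame(),
  Pdf_roundtrip$to_data_frame()
)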
}
\keyword{LazyFrame}
