% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/generate.R
\name{generate}
\alias{generate}
\title{Generate resamples, permutations, or simulations}
\usage{
generate(x, reps = 1, type = NULL, variables = !!response_expr(x), ...)
}
\arguments{
\item{x}{A data frame that can be coerced into a \link[tibble:tibble]{tibble}.}

\item{reps}{The number of resamples to generate.}

\item{type}{The method used to generate resamples of the observed
data reflecting the null hypothesis. Currently one of
\code{"bootstrap"}, \code{"permute"}, or \code{"draw"} (see below).}

\item{variables}{If \code{type = "permute"}, a set of unquoted column names in the
data to permute (independently of each other). Defaults to only the
response variable. Note that any derived effects that depend on these
columns (e.g., interaction effects) will also be affected.}

\item{...}{Currently ignored.}
}
\value{
A tibble containing \code{reps} generated datasets, indicated by the
\code{replicate} column.
}
\description{
\code{generate()} creates a simulated distribution from the output of
\code{specify()}. In the context of confidence intervals, this is a bootstrap
distribution based on the result of \code{specify()}. In the context of
hypothesis testing, this is a null distribution based on the result of
\code{specify()} and \code{hypothesize()}.

Learn more in \code{vignette("infer")}.
}
\section{Generation Types}{


The \code{type} argument determines the method used to create the null
distribution.

\itemize{
\item \code{bootstrap}: A bootstrap sample will be drawn for each replicate,
where a sample of size equal to the input sample size is drawn (with
replacement) from the input sample data.
\item \code{permute}: For each replicate, each input value will be randomly
reassigned (without replacement) to a new output value in the sample.
\item \code{draw}: A value will be sampled from a theoretical distribution
with parameter \code{p} specified in \code{\link[=hypothesize]{hypothesize()}} for each replicate. This
option is currently only applicable for testing on one proportion. This
generation type was previously called \code{"simulate"}, which has been
superseded.
}
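
As a rough illustration of the first two types, the difference between
sampling with and without replacement can be sketched in base R. This is a
conceptual sketch only, not the code infer runs internally; the vectors
\code{x} and \code{y} are hypothetical data made up for the example.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{set.seed(1)

x <- c(2, 4, 6, 8, 10)           # hypothetical response values
y <- c("a", "a", "b", "b", "b")  # hypothetical explanatory values

# bootstrap: draw length(x) values with replacement, so some
# values may appear more than once and others not at all
boot_x <- sample(x, size = length(x), replace = TRUE)

# permute: shuffle x without replacement, so every value is kept
# exactly once but its pairing with y is broken
perm_x <- sample(x)

# each permuted value of x is now matched to the original y
data.frame(x = perm_x, y = y)
}\if{html}{\out{</div>}}

A bootstrap replicate always has the same size as the input but usually a
different set of values, whereas a permuted replicate contains exactly the
original values in a new order.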
}

\section{Reproducibility}{
When using the infer package for research, or in other cases when exact
reproducibility is a priority, be sure to set the seed for R’s random
number generator. infer will respect the random seed specified in the
\code{set.seed()} function, returning the same result when generating
data with \code{generate()} given an identical seed. For instance, we can
calculate the difference in mean \code{age} by \code{college} degree status
using the \code{gss} dataset from 5 versions of \code{gss} resampled with
permutation using the following code.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{set.seed(1)

gss \%>\%
  specify(age ~ college) \%>\%
  hypothesize(null = "independence") \%>\%
  generate(reps = 5, type = "permute") \%>\%
  calculate("diff in means", order = c("degree", "no degree"))
}\if{html}{\out{</div>}}

\if{html}{\out{<div class="sourceCode">}}\preformatted{## Response: age (numeric)
## Explanatory: college (factor)
## Null Hypothesis: independence
## # A tibble: 5 × 2
##   replicate   stat
##       <int>  <dbl>
## 1         1 -0.531
## 2         2 -2.35 
## 3         3  0.764
## 4         4  0.280
## 5         5  0.350
}\if{html}{\out{</div>}}

Setting the seed to the same value again and rerunning the same code
will produce the same result.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# set the seed
set.seed(1)

gss \%>\%
  specify(age ~ college) \%>\%
  hypothesize(null = "independence") \%>\%
  generate(reps = 5, type = "permute") \%>\%
  calculate("diff in means", order = c("degree", "no degree"))
}\if{html}{\out{</div>}}

\if{html}{\out{<div class="sourceCode">}}\preformatted{## Response: age (numeric)
## Explanatory: college (factor)
## Null Hypothesis: independence
## # A tibble: 5 × 2
##   replicate   stat
##       <int>  <dbl>
## 1         1 -0.531
## 2         2 -2.35 
## 3         3  0.764
## 4         4  0.280
## 5         5  0.350
}\if{html}{\out{</div>}}
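
The same property can be checked programmatically. The following is a
minimal sketch using base R's \code{sample()} as a stand-in for any code
that draws from R's random number generator; the object names
\code{first} and \code{second} are illustrative only.

\if{html}{\out{<div class="sourceCode r">}}\preformatted{# draw a random ordering of 1:10 after setting the seed
set.seed(1)
first <- sample(10)

# reset the seed to the same value and draw again
set.seed(1)
second <- sample(10)

# the two draws match exactly
identical(first, second)
}\if{html}{\out{</div>}}

\if{html}{\out{<div class="sourceCode">}}\preformatted{## [1] TRUE
}\if{html}{\out{</div>}}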

Please keep this in mind when writing infer code that utilizes
resampling with \code{generate()}.
}

\examples{
# generate a null distribution by taking 200 bootstrap samples
gss \%>\%
 specify(response = hours) \%>\%
 hypothesize(null = "point", mu = 40) \%>\%
 generate(reps = 200, type = "bootstrap")

# generate a null distribution for the independence of
# two variables by permuting their values 200 times
gss \%>\%
 specify(partyid ~ age) \%>\%
 hypothesize(null = "independence") \%>\%
 generate(reps = 200, type = "permute")
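
# a sketch of the `variables` argument: permute the two explanatory
# columns instead of the response. The choice of `hours`, `age`, and
# `college` is illustrative only, and `null_multi` is a name made up
# for this example
null_multi <- gss \%>\%
  specify(hours ~ age + college) \%>\%
  hypothesize(null = "independence") \%>\%
  generate(reps = 5, type = "permute", variables = c(age, college))

null_multi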

# generate a null distribution via sampling from a
# binomial distribution 200 times
gss \%>\%
  specify(response = sex, success = "female") \%>\%
  hypothesize(null = "point", p = .5) \%>\%
  generate(reps = 200, type = "draw") \%>\%
  calculate(stat = "z")

# more in-depth explanation of how to use the infer package
\dontrun{
vignette("infer")
}

}
\seealso{
Other core functions: 
\code{\link{calculate}()},
\code{\link{hypothesize}()},
\code{\link{specify}()}
}
\concept{core functions}
