% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/add_inactive_replicates.R
\name{add_inactive_replicates}
\alias{add_inactive_replicates}
\title{Add inactive replicates to a survey design object}
\usage{
add_inactive_replicates(design, n_total, n_to_add, location = "last")
}
\arguments{
\item{design}{A survey design object, created with either the \code{survey} or \code{srvyr} packages.}

\item{n_total}{The total number of replicates
that the result should contain. If the design already contains \code{n_total}
replicates (or more), then no update is made.}

\item{n_to_add}{The number of additional replicates to add.
Specify either the \code{n_total} argument or the \code{n_to_add} argument,
but not both.}

\item{location}{Either \code{"first"}, \code{"last"} (the default), or \code{"random"}.
Specifies where the columns of new replicates should be located in the
matrix of replicate weights. Use \code{"first"}
to place new replicates first (i.e., in the leftmost part of the matrix),
\code{"last"} to place the new replicates last (i.e., in the rightmost part
of the matrix). Use \code{"random"} to intersperse the new replicates
in random column locations of the matrix; the original replicates will
still be in their original order.}
}
\value{
An updated survey design object, where the number of columns
of replicate weights has potentially increased. The increase only happens
if the user specifies the \code{n_to_add} argument instead of \code{n_total},
or if the user specifies \code{n_total} and \code{n_total} is greater than the number
of columns of replicate weights that the design already had.
}
\description{
Adds inactive replicates to a survey design object. An inactive
replicate is a replicate that does not contribute to variance estimates but
adds to the matrix of replicate weights so that the matrix has the desired
number of columns. The new replicates' values are simply equal to the full-sample weights.
}
\section{Statistical Details}{

Inactive replicates are also sometimes referred to as "dead replicates",
for example in Ash (2014). The purpose of adding inactive replicates
is to increase the number of columns of replicate weights without impacting
variance estimates. This can be useful, for example, when combining data
from a survey across multiple years, where different years use different numbers
of replicates, but a consistent number of replicates is desired in the combined
data file.

Suppose the initial replicate design has \eqn{L} replicates, with
respective constants \eqn{c_k} for \eqn{k=1,\dots,L} used to estimate variance
with the formula
\deqn{v_{R} = \sum_{k=1}^L c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2}
where \eqn{\hat{T}_y} is the estimate produced using the full-sample weights
and \eqn{\hat{T}_y^{(k)}} is the estimate from replicate \eqn{k}.

Inactive replicates are simply replicates that are exactly equal to the full sample:
that is, the replicate \eqn{k} is called "inactive" if its vector of replicate
weights exactly equals the full-sample weights. In this case, when using the formula
above to estimate variances, these replicates contribute nothing to the variance estimate.
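
To make this explicit: if replicate \eqn{k} is inactive, then \eqn{\hat{T}_y^{(k)} = \hat{T}_y},
so its term in the sum above is
\deqn{c_k\left(\hat{T}_y^{(k)}-\hat{T}_y\right)^2 = c_k\left(\hat{T}_y-\hat{T}_y\right)^2 = 0}
regardless of the value of the constant \eqn{c_k}.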

If the analyst uses the variant of the formula above where the full-sample estimate
\eqn{\hat{T}_y} is replaced by the average replicate estimate (i.e., \eqn{L^{-1}\sum_{k=1}^{L}\hat{T}_y^{(k)}}),
then variance estimates will differ before vs. after adding the inactive replicates.
For this reason, it is strongly recommended to explicitly specify \code{mse=TRUE}
when creating a replicate design object in R with functions such as \code{svrepdesign()},
\code{as_bootstrap_design()}, etc. If working with an already existing replicate design,
you can update the \code{mse} option to \code{TRUE} simply by using code such as
\code{my_design$mse <- TRUE}.
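
As a brief sketch of why the mean-centered variant is affected (the symbols \eqn{A} and \eqn{\bar{T}_y}
are introduced here only for illustration): let \eqn{\bar{T}_y = L^{-1}\sum_{k=1}^{L}\hat{T}_y^{(k)}}
be the average of the original replicate estimates. After adding \eqn{A} inactive replicates,
the average replicate estimate becomes
\deqn{\frac{L\bar{T}_y + A\hat{T}_y}{L+A}}
which differs from \eqn{\bar{T}_y} whenever \eqn{\bar{T}_y \neq \hat{T}_y}. Every squared deviation
is then taken around a shifted center, so the variance estimate generally changes.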
}

\examples{
library(survey)
set.seed(2023)

# Create an example survey design object

  sample_data <- data.frame(
    PSU     = c(1,2,3)
  )

  survey_design <- svydesign(
    data = sample_data,
    ids = ~ PSU,
    weights = ~ 1
  )

  rep_design <- survey_design |>
    as.svrepdesign(type = "JK1", mse = TRUE)

# Inspect replicates before adding inactive replicates

  rep_design |> weights(type = "analysis")

# Inspect replicates after adding inactive replicates

  rep_design |>
    add_inactive_replicates(n_total = 5, location = "first") |>
    weights(type = "analysis")

  rep_design |>
    add_inactive_replicates(n_to_add = 2, location = "last") |>
    weights(type = "analysis")

  rep_design |>
    add_inactive_replicates(n_to_add = 5, location = "random") |>
    weights(type = "analysis")
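
# Compare standard errors before and after adding inactive replicates.
# This is a minimal sketch: it reuses 'rep_design' from above (created
# with mse = TRUE) and uses the 'PSU' column only as a convenient
# numeric variable for illustration.

  expanded_design <- rep_design |>
    add_inactive_replicates(n_total = 5, location = "last")

  se_before <- SE(svytotal(x = ~ PSU, design = rep_design))
  se_after  <- SE(svytotal(x = ~ PSU, design = expanded_design))

  # With mse = TRUE, the two standard errors should agree, since inactive
  # replicates add nothing to deviations around the full-sample estimate
  all.equal(se_before, se_after)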

}
\references{
Ash, S. (2014). "\emph{Using successive difference replication for estimating variances}."
\strong{Survey Methodology}, Statistics Canada, 40(1), 47–59.
}
