% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/calibrate_to_sample.R
\name{calibrate_to_sample}
\alias{calibrate_to_sample}
\title{Sample-based Calibration with Replicates}
\usage{
calibrate_to_sample(
  primary_rep_design,
  control_rep_design,
  cal_formula,
  calfun = survey::cal.linear,
  bounds = list(lower = -Inf, upper = Inf),
  verbose = FALSE,
  maxit = 50,
  epsilon = 1e-07,
  variance = NULL,
  control_col_matches = NULL
)
}
\arguments{
\item{primary_rep_design}{A replicate design object for the primary survey, created with either the \code{survey} or \code{srvyr} packages.}

\item{control_rep_design}{A replicate design object for the control survey.}

\item{cal_formula}{A formula listing the variables to use for calibration.
All of these variables must be included in both \code{primary_rep_design} and \code{control_rep_design}.}

\item{calfun}{A calibration function from the \code{survey} package,
such as \link[survey]{cal.linear}, \link[survey]{cal.raking}, or \link[survey]{cal.logit}.
Use \code{cal.linear} for ordinary post-stratification, and \code{cal.raking} for raking.
See \link[survey]{calibrate} for additional details.}

\item{bounds}{Parameter passed to \link[survey]{grake} for calibration. See \link[survey]{calibrate} for details.}

\item{verbose}{Parameter passed to \link[survey]{grake} for calibration. See \link[survey]{calibrate} for details.}

\item{maxit}{Parameter passed to \link[survey]{grake} for calibration. See \link[survey]{calibrate} for details.}

\item{epsilon}{Parameter passed to \link[survey]{grake} for calibration. \cr
After calibration, the absolute difference between each calibration target and the calibrated estimate
will be no larger than \code{epsilon} times (1 plus the absolute value of the target).
See \link[survey]{calibrate} for details.}

\item{variance}{Parameter passed to \link[survey]{grake} for calibration. See \link[survey]{calibrate} for details.}

\item{control_col_matches}{Optional parameter to specify which control survey replicate
is matched to each primary survey replicate. If the \eqn{i}-th entry of \code{control_col_matches}
equals \eqn{k}, then replicate \eqn{i} in \code{primary_rep_design} is matched
to replicate \eqn{k} in \code{control_rep_design}.
Entries of \code{NA} denote a primary survey replicate not matched to any control survey replicate.
If this parameter is not used, matching is done at random.}
}
\value{
A replicate design object, with full-sample weights calibrated to totals from \code{control_rep_design},
and replicate weights adjusted to account for variance of the control totals.
If \code{primary_rep_design} had fewer columns of replicate weights than \code{control_rep_design},
then the number of replicate columns and the length of \code{rscales} will be increased by a multiple \code{k},
and the \code{scale} will be updated by dividing by \code{k}. \cr \cr
The element \code{control_column_matches} indicates, for each replicate column of the calibrated primary survey,
which column of replicate weights it was matched to from the control survey.
Columns which were not matched to a control survey replicate column are indicated by \code{NA}. \cr \cr
The element \code{degf} will be set to match that of the primary survey
to ensure that the degrees of freedom are not erroneously inflated by
potential increases in the number of columns of replicate weights.
}
\description{
Calibrate the weights of a primary survey to match estimated totals from a control survey,
using adjustments to the replicate weights to account for the variance of the estimated control totals.
Both surveys must have replicate weights.
The adjustments to replicate weights are conducted using the method proposed by Opsomer and Erciulescu (2021).
This method can be used to implement general calibration as well as post-stratification or raking specifically
(see the details for the \code{calfun} parameter).
}
\details{
With the Opsomer-Erciulescu method, each column of replicate weights from the control survey
is randomly matched to a column of replicate weights from the primary survey,
and then the column from the primary survey is calibrated to control totals estimated by
perturbing the control sample's full-sample estimates using the estimates from the
matched column of replicate weights from the control survey.
\cr \cr
If there are fewer columns of replicate weights in the control survey than in the primary survey,
then not all primary replicate columns will be matched to a replicate column from the control survey. \cr

If there are more columns of replicate weights in the control survey than in the primary survey,
then the columns of replicate weights in the primary survey will be duplicated \code{k} times, where \code{k} is the smallest
positive integer such that the resulting number of columns of replicate weights for the primary survey is greater than or equal
to the number of columns of replicate weights in the control survey. \cr

Because replicate columns of the control survey are matched \emph{at random} to primary survey replicate columns,
results can differ between calls unless the matching is made reproducible.
There are two ways to do so: the user can either call \link[base]{set.seed} before using the function,
or supply a mapping to the argument \code{control_col_matches}.
}
\section{Syntax for Common Types of Calibration}{

For ratio estimation with an auxiliary variable \code{X},
use the following options: \cr
  - \code{cal_formula = ~ -1 + X} \cr
  - \code{variance = 1} \cr
  - \code{calfun = survey::cal.linear}

For post-stratification, use the following option:

  - \code{calfun = survey::cal.linear}

For raking, use the following option:

  - \code{calfun = survey::cal.raking}
}

\examples{
\donttest{

# Load example data for primary survey ----

  library(survey)
  data(api)

  primary_survey <- svydesign(id=~dnum, weights=~pw, data=apiclus1, fpc=~fpc) |>
    as.svrepdesign(type = "JK1")

# Load example data for control survey ----

  control_survey <- svydesign(id = ~ 1, fpc = ~fpc, data = apisrs) |>
    as.svrepdesign(type = "JK1")

# Calibrate totals for one categorical variable and one numeric ----

  calibrated_rep_design <- calibrate_to_sample(
    primary_rep_design = primary_survey,
    control_rep_design = control_survey,
    cal_formula = ~ stype + enroll
  )

# Inspect estimates before and after calibration ----

  ##_ For the calibration variables, estimates and standard errors
  ##_ from the calibrated design will match those of the control survey

    svytotal(x = ~ stype + enroll, design = primary_survey)
    svytotal(x = ~ stype + enroll, design = control_survey)
    svytotal(x = ~ stype + enroll, design = calibrated_rep_design)

  ##_ Estimates for other variables will be changed as well

    svymean(x = ~ api00 + api99, design = primary_survey)
    svymean(x = ~ api00 + api99, design = control_survey)
    svymean(x = ~ api00 + api99, design = calibrated_rep_design)

# Inspect weights before and after calibration ----

  summarize_rep_weights(primary_survey, type = 'overall')
  summarize_rep_weights(calibrated_rep_design, type = 'overall')
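
# Illustrate the duplication rule for replicate columns ----

  ##_ A minimal sketch of the rule described in the details, using
  ##_ hypothetical counts of replicate columns (not read from the designs above)

    n_primary_reps <- 15   # hypothetical number of primary replicate columns
    n_control_reps <- 200  # hypothetical number of control replicate columns

  ##_ Smallest positive integer k with k * n_primary_reps >= n_control_reps

    k <- max(1, ceiling(n_control_reps / n_primary_reps))
    print(k)
    print(k * n_primary_reps)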

# For reproducibility, specify how to match replicates between surveys ----

  column_matching <- calibrated_rep_design$control_col_matches
  print(column_matching)

  calibrated_rep_design <- calibrate_to_sample(
    primary_rep_design = primary_survey,
    control_rep_design = control_survey,
    cal_formula = ~ stype + enroll,
    control_col_matches = column_matching
  )
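
# Ratio estimation (a sketch following the documented syntax) ----

  ##_ Assumes `enroll` plays the role of the auxiliary variable X;
  ##_ reuses `primary_survey` and `control_survey` created earlier

  ratio_rep_design <- calibrate_to_sample(
    primary_rep_design = primary_survey,
    control_rep_design = control_survey,
    cal_formula = ~ -1 + enroll,
    variance = 1,
    calfun = survey::cal.linear
  )

  ##_ Compare the calibrated total of `enroll` with the control survey's estimate

    svytotal(x = ~ enroll, design = control_survey)
    svytotal(x = ~ enroll, design = ratio_rep_design)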
}
}
\references{
Opsomer, J.D. and A. Erciulescu (2021).
"Replication variance estimation after sample-based calibration."
\strong{Survey Methodology}, \emph{47}: 265-277.
}
\seealso{
\code{\link{calibrate_to_estimate}} as an alternative
when the control survey's data are not available or
the control survey does not have replicate weights,
but an estimate and its variance-covariance matrix are available.
}
