% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/post-action-tailor.R
\name{add_tailor}
\alias{add_tailor}
\alias{remove_tailor}
\alias{update_tailor}
\title{Add a tailor to a workflow}
\usage{
add_tailor(x, tailor, ...)

remove_tailor(x)

update_tailor(x, tailor, ...)
}
\arguments{
\item{x}{A workflow}

\item{tailor}{A tailor created using \code{\link[tailor:tailor]{tailor::tailor()}}. The tailor
should not have been trained already with \code{\link[tailor:reexports]{tailor::fit()}}; workflows
will handle training internally.}

\item{...}{Not used.}
}
\value{
\code{x}, updated with either a new or removed tailor postprocessor.
}
\description{
\itemize{
\item \code{add_tailor()} specifies post-processing steps to apply to
predictions using a tailor.
\item \code{remove_tailor()} removes the tailor as well as any downstream objects
that might get created after the tailor is used for post-processing, such as
the fitted tailor.
\item \code{update_tailor()} first removes the existing tailor, then adds
the new one in its place.
}
}
\section{Data Usage}{


While preprocessors and models are trained on data in the usual sense,
postprocessors are trained on \emph{predictions} on data. When a workflow
is fitted, the user typically supplies training data with the \code{data} argument.
When workflows don't contain a postprocessor that requires training,
users can pass all of the available data to the \code{data} argument to train the
preprocessor and model. However, when a postprocessor must be trained as
well, allotting all of the available data to the \code{data} argument
would leave no data to train the postprocessor with. In that case, workflows
would need to \code{predict()} from the preprocessor and model on the same \code{data}
that they were trained on, with the postprocessor then training on those
predictions. Predictions on data that a model was trained on likely follow
different distributions than predictions on unseen data. Thus, the available
data must be split into two training sets: the first is used to
train the preprocessor and model, and the second, called the "calibration set,"
is passed to the trained preprocessor and model to generate predictions,
which then form the training data for the postprocessor.
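
As a rough illustration of this split, the following base R sketch holds out a
calibration set by hand. It does not use workflows or tailor: the \code{lm()} fit
on predictions is only a stand-in for a postprocessor that requires training,
and the choice of \code{mtcars}, the predictors, and the split size are all
assumptions made for the example.

\if{html}{\out{<div class="sourceCode">}}\preformatted{set.seed(123)

# Hold out 10 rows as a calibration set (size chosen arbitrarily)
calib_rows <- sample(nrow(mtcars), size = 10)
train <- mtcars[-calib_rows, ]
calibration <- mtcars[calib_rows, ]

# "Model": trained only on the first training set
model <- lm(mpg ~ wt + hp, data = train)

# Predictions on rows the model has not seen
calib_preds <- data.frame(
  truth = calibration$mpg,
  pred = predict(model, newdata = calibration)
)

# Stand-in "postprocessor": trained on predictions, not on the raw data
post <- lm(truth ~ pred, data = calib_preds)

# At prediction time, model predictions are passed through the postprocessor
# (the first rows of mtcars stand in for new data here)
new_pred <- predict(model, newdata = head(mtcars, 3))
adjusted <- predict(post, newdata = data.frame(pred = new_pred))
}\if{html}{\out{</div>}}

Here \code{post} never sees predictions on rows that \code{model} was trained on,
which is the property the calibration set is meant to guarantee.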

When fitting a workflow with a postprocessor that requires training
(i.e., one that returns \code{TRUE} in \code{.workflow_postprocessor_requires_fit(workflow)}),
users must pass two data arguments: the usual \code{fit.workflow(data)} will be
used to train the preprocessor and model, while \code{fit.workflow(data_calibration)}
will be used to train the postprocessor.

In some situations, randomly splitting \code{fit.workflow(data)} (with
\code{rsample::initial_split()}, for example) is sufficient to prevent data
leakage. However, \code{fit.workflow(data)} could also have arisen as:

\if{html}{\out{<div class="sourceCode">}}\preformatted{boots <- rsample::bootstraps(some_other_data)
split <- rsample::get_rsplit(boots, 1)
data <- rsample::analysis(split)
}\if{html}{\out{</div>}}

In this case, some of the rows in \code{data} will be duplicated. Thus, randomly
allotting some of them to train the preprocessor and model and others to train
the postprocessor would likely result in the same rows appearing in both
datasets, resulting in the preprocessor and model generating predictions on
rows they've seen before. Similarly problematic situations could arise in
other resampling schemes, like time-based splits.
In general, \code{rsample::internal_calibration_split()} offers a way to prevent data
leakage when resampling. When workflows with postprocessors that require
training are passed to the tune package, this is handled internally.
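
The duplication problem can be reproduced with a small base R sketch. The
data frame \code{original} and its \code{id} column are hypothetical, and splitting
on unique row identifiers is just one simple way to keep copies of a row
together; it is not meant to describe how
\code{rsample::internal_calibration_split()} is implemented.

\if{html}{\out{<div class="sourceCode">}}\preformatted{set.seed(2024)

# Hypothetical data with an explicit row identifier
original <- data.frame(id = 1:20, y = rnorm(20))

# A bootstrap analysis set: rows drawn with replacement
boot_rows <- sample(nrow(original), replace = TRUE)
data <- original[boot_rows, ]

# Naive random split of the bootstrap sample: copies of the same
# original row can land on both sides
naive_rows <- sample(nrow(data), size = 15)
naive_train <- data[naive_rows, ]
naive_calib <- data[-naive_rows, ]
leaked_ids <- intersect(naive_train$id, naive_calib$id)

# Splitting on unique ids keeps all copies of a row on one side
ids <- unique(data$id)
calib_ids <- sample(ids, size = ceiling(length(ids) / 4))
in_calib <- is.element(data$id, calib_ids)
safe_train <- data[!in_calib, ]
safe_calib <- data[in_calib, ]

# No id is shared between the two sets, by construction
length(intersect(safe_train$id, safe_calib$id)) == 0
}\if{html}{\out{</div>}}

Inspecting \code{leaked_ids} shows which original rows, if any, ended up in both
halves of the naive split.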
}

\examples{
\dontshow{if (rlang::is_installed(c("tailor", "probably"))) (if (getRversion() >= "3.4") withAutoprint else force)(\{ # examplesIf}
library(tailor)
library(magrittr)

tailor <- tailor()
tailor_1 <- adjust_probability_threshold(tailor, .1)

workflow <- workflow() |>
  add_tailor(tailor_1)

workflow

remove_tailor(workflow)

update_tailor(workflow, adjust_probability_threshold(tailor, .2))
\dontshow{\}) # examplesIf}
}
