% Generated by roxygen2: do not edit by hand
% Please edit documentation in R/mbo_defaults.R
\name{default_surrogate}
\alias{default_surrogate}
\title{Default Surrogate}
\usage{
default_surrogate(
  instance,
  learner = NULL,
  n_learner = NULL,
  force_random_forest = FALSE
)
}
\arguments{
\item{instance}{(\link[bbotk:OptimInstance]{bbotk::OptimInstance})\cr
An object that inherits from \link[bbotk:OptimInstance]{bbotk::OptimInstance}.}

\item{learner}{(\code{NULL} | \link[mlr3:Learner]{mlr3::Learner}).
If specified, this learner is used instead of the default Gaussian process or random forest.}

\item{n_learner}{(\code{NULL} | \code{integer(1)}).
Number of learners to be considered in the construction of the \link{Surrogate}.
If not specified, this is derived from the number of objectives of the instance.}

\item{force_random_forest}{(\code{logical(1)}).
If \code{TRUE}, a random forest is constructed even if the parameter space is numeric-only.}
}
\value{
\link{Surrogate}
}
\description{
This is a helper function that constructs a default \link{Surrogate} based on properties of the
\link[bbotk:OptimInstance]{bbotk::OptimInstance}.

For numeric-only (including integer) parameter spaces without any dependencies, a Gaussian process is constructed via
\code{\link[=default_gp]{default_gp()}}.
For mixed numeric-categorical parameter spaces, or spaces with conditional parameters, a random forest is constructed via
\code{\link[=default_rf]{default_rf()}}.

In any case, learners are encapsulated using \code{"evaluate"}, and a fallback learner is set
in case the surrogate learner errors.
Currently, the following learner is used as a fallback:
\code{lrn("regr.ranger", num.trees = 10L, keep.inbag = TRUE, se.method = "jack")}.

If dependencies are additionally present in the parameter space, inactive conditional parameters
are represented by missing (\code{NA}) values in the training design data.
These are handled by imputation steps added to the random forest:
\code{po("imputesample")} (for logicals) and \code{po("imputeoor")} (for anything else) from
package \CRANpkg{mlr3pipelines}.
Characters are always encoded as factors via \code{po("colapply")}.
Out-of-range imputation makes sense for tree-based methods and is usually hard to beat, see Ding et al. (2010).
In the case of dependencies, the following learner is used as a fallback:
\code{lrn("regr.featureless")}.

If \code{n_learner} is \code{1}, the learner is wrapped as a \link{SurrogateLearner}.
Otherwise, if \code{n_learner} is larger than \code{1}, multiple deep clones of the learner are wrapped as a \link{SurrogateLearnerCollection}.
}
\references{
\itemize{
\item Ding, Yufeng, Simonoff, Jeffrey S (2010).
\dQuote{An Investigation of Missing Data Methods for Classification Trees Applied to Binary Response Data.}
\emph{Journal of Machine Learning Research}, \bold{11}(1), 131--170.
}
}
\seealso{
Other mbo_defaults: 
\code{\link{default_acqfunction}()},
\code{\link{default_acqoptimizer}()},
\code{\link{default_gp}()},
\code{\link{default_loop_function}()},
\code{\link{default_result_assigner}()},
\code{\link{default_rf}()},
\code{\link{mbo_defaults}}
}
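\examples{
# A minimal sketch, not an authoritative recipe: it illustrates how the
# defaults described above might be obtained for a toy problem.
# Assumptions (not stated on this page): the toy objective, the use of
# bbotk's OptimInstanceBatchSingleCrit class (its name may differ across
# bbotk versions), and the availability of the suggested packages below.
if (requireNamespace("mlr3learners", quietly = TRUE) &&
    requireNamespace("DiceKriging", quietly = TRUE) &&
    requireNamespace("ranger", quietly = TRUE)) {
  library(bbotk)
  library(paradox)
  library(mlr3learners)

  # Hypothetical single-objective toy function on a numeric-only domain.
  fun = function(xs) {
    list(y = xs$x^2)
  }
  domain = ps(x = p_dbl(lower = -10, upper = 10))
  codomain = ps(y = p_dbl(tags = "minimize"))
  objective = ObjectiveRFun$new(fun = fun, domain = domain, codomain = codomain)

  instance = OptimInstanceBatchSingleCrit$new(
    objective = objective,
    terminator = trm("evals", n_evals = 5))

  # Numeric-only space without dependencies: a Gaussian process is expected.
  surrogate = default_surrogate(instance)

  # Request a random forest even though the space is numeric-only.
  surrogate_rf = default_surrogate(instance, force_random_forest = TRUE)
}
}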
\concept{mbo_defaults}
