Standard optimality criteria (D, A, I, …) assume the model is correctly specified and aim to estimate its parameters as precisely as possible. KL-Optimality addresses a different question: given two competing models, where should observations be taken to tell them apart most easily?
The criterion is based on the Kullback-Leibler divergence between the reference model \(f_1\) and the best-fitting version of the rival \(f_2\):
\[\Phi_{\text{KL}}(\xi) = \int \min_{\beta_2} \text{KL}(f_1(y \mid x) \,\|\, f_2(y \mid x, \beta_2))\, \xi(dx).\]
The design \(\xi^*\) that maximises \(\Phi_{\text{KL}}\) places observations where the two models are most distinguishable on average.
Reference: López-Fidalgo, Tommasi & Trandafir (2007). An optimal experimental design criterion for discriminating between non-normal models. J. R. Stat. Soc. B, 69(2), 231–242.
opt_des(
"KL-Optimality",
model = <reference model formula>,
parameters = <reference parameter names>,
par_values = <nominal values>,
design_space = <scalar interval or named list>,
# rival specification:
rival_model = <rival formula>, # default: same as model
rival_params = <rival param names>, # default: same as parameters
rival_pars = <starting values for internal optimisation>,
rival_lower = <lower bounds for rival params>,
rival_upper = <upper bounds for rival params>,
# GLM family:
family = "Normal", # "Poisson", "Binomial", "Gamma"
phi = 1, # dispersion parameter
# alternative: supply a custom KL function
kl_fun = NULL
)The internal optimisation over rival parameters is done
automatically. The optimal rival values are stored as a hidden attribute
and shown by summary().
Context. Enzyme kinetics with one substrate. The reference model is Michaelis-Menten \(\mu(x) = V_{\max} x / (K_m + x)\) with \(V_{\max} = 2\), \(K_m = 1\). The rival is a linear rate \(\mu(x) = a x\), valid only when \(x \ll K_m\). At high concentrations MM saturates while the linear model keeps growing — that is where the models diverge most.
result_kl_mm <- opt_des(
"KL-Optimality",
model = y ~ Vmax * x / (Km + x),
parameters = c("Vmax", "Km"),
par_values = c(2, 1),
design_space = c(0.1, 5),
rival_model = y ~ a * x,
rival_params = c("a"),
rival_pars = c(1),
rival_lower = c(0.2),
rival_upper = c(2.5),
family = "Normal",
phi = 1
)
#>
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#>
⠙ Calculating optimal design 7 done (11/s) | 626ms
ℹ The lower bound for efficiency is 99.9957589718079%
#>
result_kl_mm
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 1.127578 0.8159831
#> 2 5.000000 0.1840169summary() shows the GLM family and the optimal rival
parameters found by the internal optimisation:
summary(result_kl_mm)
#> Model:
#> y ~ Vmax * x/(Km + x)
#> Weight function:
#> function (x)
#> 1
#>
#> GLM family: Normal (phi = 1)
#> Optimal rival parameters: 0.445
#>
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 1.127578 0.8159831
#> 2 5.000000 0.1840169
#>
#> Minimum efficiency (Atwood): 99.9957589718079%
#> Criterion value: 0.155804The sensitivity plot shows \(d_{\text{KL}}(x, \xi)\) with the Equivalence Theorem bound (dashed). Support points concentrate in the saturation region where MM and the linear rival differ most:
KL sensitivity function: MM vs linear rival.
Efficiency of a five-point uniform design relative to the KL-optimal:
design_unif <- data.frame(
Point = c(0.1, 1.3, 2.5, 3.8, 5.0),
Weight = rep(1/5, 5)
)
eff_kl <- design_efficiency(design_unif, result_kl_mm)
#> ℹ The efficiency of the design is 44.1071528869909%
cat("Efficiency of uniform design:", round(eff_kl * 100, 2), "%\n")
#> Efficiency of uniform design: 44.11 %Context. A drug clearance study where the event count follows a Poisson process with rate \(\mu(x) = \exp(a - bx)\). The reference model has nominal decay rate \(b = 0.5\). The rival represents a faster-elimination regime (\(b \in [0.8, 1.5]\)), which could arise from drug interactions. The KL divergence for Poisson is \(\text{KL} = \mu_2 - \mu_1 - \mu_1 \log(\mu_2/\mu_1)\); since the rival decays faster, the gap is largest at long follow-up times.
result_kl_pois <- opt_des(
"KL-Optimality",
model = y ~ exp(a - b * x),
parameters = c("a", "b"),
par_values = c(2, 0.5),
design_space = c(0, 4),
rival_pars = c(2, 1.0),
rival_lower = c(1.5, 0.8),
rival_upper = c(2.5, 1.5),
family = "Poisson",
phi = 1
)
#>
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#>
⠙ Calculating optimal design 4 done (7.2/s) | 555ms
ℹ The lower bound for efficiency is 99.9999938849073%
#>
result_kl_pois
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0 0.2012678
#> 2 4 0.7987322
summary(result_kl_pois)
#> Model:
#> y ~ exp(a - b * x)
#> Weight function:
#> function (x)
#> 1
#>
#> GLM family: Poisson (phi = 1)
#> Optimal rival parameters: 2.2799, 0.8
#>
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0 0.2012678
#> 2 4 0.7987322
#>
#> Minimum efficiency (Atwood): 99.9999938849073%
#> Criterion value: 0.318553KL sensitivity for Poisson decay model.
The optimal rival parameters found internally confirm the closest rival within the constraint:
make_kl_fun() for custom KL functionsFor some discrimination problems the KL divergence does not reduce to
the standard cumulant-generating-function form that
opt_des() uses internally. make_kl_fun()
automates the construction of the KL function from two model-family
pairs.
| Combination | Formula |
|---|---|
| Same family, same \(\phi\) | Standard cumulant form |
| Normal vs Normal, \(\phi_1 \neq \phi_2\) | \(\frac{1}{2}\left[\log\frac{\phi_2}{\phi_1} + \frac{\phi_1}{\phi_2} + \frac{(\mu_1-\mu_2)^2}{\phi_2} - 1\right]\) |
| Gamma vs Gamma, different shape | Closed form with digamma/lgamma |
For other pairs, supply kl_fun directly as a function
(x, beta2) -> scalar.
Context. Same exponential decay model, but now the rival has variance \(\phi_2 = 4\) (four times larger). The KL function includes a constant term \(\frac{1}{2}[\log(\phi_2/\phi_1) + \phi_1/\phi_2 - 1]\) even when the means coincide, so the design must also account for the variance difference.
kl_fn_var <- make_kl_fun(
"Normal",
model1 = y ~ a * exp(-b * x),
params1 = c("a", "b"),
par_values1 = c(1, 0.5),
phi1 = 1,
family2 = "Normal",
model2 = y ~ c * exp(-d * x),
params2 = c("c", "d"),
phi2 = 4
)
result_kl_var <- opt_des(
"KL-Optimality",
model = y ~ a * exp(-b * x),
parameters = c("a", "b"),
par_values = c(1, 0.5),
design_space = c(0, 4),
kl_fun = kl_fn_var,
rival_pars = c(1, 1.0),
rival_lower = c(0.5, 0.8),
rival_upper = c(2.0, 1.5)
)
#>
⠙ Calculating optimal design 10 done (4.9/s) | 2.1s
⠹ Calculating optimal design 11 done (4.8/s) | 2.3s
⠸ Calculating optimal design 12 done (4.7/s) | 2.6s
⠼ Calculating optimal design 13 done (4.6/s) | 2.8s
⠴ Calculating optimal design 14 done (4.6/s) | 3.1s
⠦ Calculating optimal design 15 done (4.6/s) | 3.3s
⠧ Calculating optimal design 16 done (4.6/s) | 3.5s
⠇ Calculating optimal design 17 done (4.6/s) | 3.7s
⠏ Calculating optimal design 18 done (4.7/s) | 3.9s
⠋ Calculating optimal design 19 done (4.6/s) | 4.2s
⠙ Calculating optimal design 20 done (4.6/s) | 4.4s
⠹ Calculating optimal design 21 done (4.6/s) | 4.5s
#> ℹ Stop condition not reached, max iterations performed
#>
⠸ Calculating optimal design 22 done (4.7/s) | 4.7s
ℹ The lower bound for efficiency is 99.9671378206639%
#>
result_kl_var
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0.000000 0.16826906
#> 2 1.333333 0.07912872
#> 3 1.989481 0.61240029
#> 4 2.666667 0.14020193
summary(result_kl_var)
#> Model:
#> y ~ a * exp(-b * x)
#> Weight function:
#> function (x)
#> 1
#>
#> KL function: user-supplied
#> Optimal rival parameters: 1.1353, 0.8
#>
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0.000000 0.16826906
#> 2 1.333333 0.07912872
#> 3 1.989481 0.61240029
#> 4 2.666667 0.14020193
#>
#> Minimum efficiency (Atwood): 99.9671378206639%
#> Criterion value: 0.320445KL sensitivity: Normal phi=1 vs Normal phi=4.
make_kl_fun,
2D)make_kl_fun() works with models that use different
subsets of the design variables. Here the reference uses \((x_1, x_2)\) (bisubstrate MM) while the
rival depends only on \(x_1\)
(single-substrate linear approximation):
kl_fn_2d <- make_kl_fun(
"Normal",
model1 = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)),
params1 = c("Vmax", "Km1", "Km2"),
par_values1 = c(1, 1, 1),
model2 = y ~ alpha * x1,
params2 = "alpha"
)
result_kl_2d <- opt_des(
"KL-Optimality",
model = y ~ Vmax * x1 * x2 / ((Km1 + x1) * (Km2 + x2)),
parameters = c("Vmax", "Km1", "Km2"),
par_values = c(1, 1, 1),
design_space = list(x1 = c(0.1, 5), x2 = c(0.1, 5)),
kl_fun = kl_fn_2d,
rival_pars = c(0.5),
rival_lower = c(0.05),
rival_upper = c(2.0)
)
#>
⠙ Calculating optimal design 3 done (1.3/s) | 2.3s
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#>
⠹ Calculating optimal design 6 done (2.4/s) | 2.5s
ℹ The lower bound for efficiency is 99.991707653218%
#>
result_kl_2d
#> Optimal design for KL-Optimality (2 factors):
#> x1 x2 Weight
#> 1 2.062645 5.0 0.7079501
#> 2 5.000000 0.1 0.2920499The heatmap concentrates support where the difference between the bisubstrate hyperbola and the linear rival is greatest (saturation regime in both \(x_1\) and \(x_2\)):
KL sensitivity heatmap: 2D MM vs linear rival.
Reference: De la Calle-Arroyo, Leorato, Rodríguez-Aragón & Tommasi (2025). Augmented designs to choose between constant absolute and relative errors. Chemometrics and Intelligent Laboratory Systems, doi:10.1016/j.chemolab.2025.105374.
Context. Dose-response experiments with the Hill (4-parameter Emax) model:
\[\eta(x, \beta) = (E_{\text{con}} - b) \cdot \frac{(x/\text{IC}_{50})^s}{1 + (x/\text{IC}_{50})^s} + b\]
with \(E_{\text{con}} = 1.70\), \(b = 0.137\), \(\text{IC}_{50} = 111\), \(s = -1.03\) and \(x \in [0.01, 1500]\) nM. Two error structures are competing:
The KL divergence minimised over the rival variance \(\sigma_{\text{abs}}^2\) is
\[\text{KL}(x,\, \sigma_{\text{abs}}^2) = \frac{1}{2}\!\left(\frac{\eta^2(x)}{\sigma_{\text{abs}}^2} - 1 + \log\frac{\sigma_{\text{abs}}^2}{\eta^2(x)}\right).\]
We supply this directly as kl_fun since it is not
covered by the standard Normal-vs-Normal formula (the reference variance
is \(x\)-dependent):
kl_fun_hill <- function(x, beta2) {
sigma_sq <- beta2[1]
eta <- (1.70 - 0.137) * (x / 111)^(-1.03) / (1 + (x / 111)^(-1.03)) + 0.137
0.5 * (eta^2 / sigma_sq - 1 + log(sigma_sq / eta^2))
}
result_kl_hill <- opt_des(
"KL-Optimality",
model = y ~ (Econ - b) * (x / IC50)^s / (1 + (x / IC50)^s) + b,
parameters = c("Econ", "b", "IC50", "s"),
par_values = c(1.70, 0.137, 111, -1.03),
design_space = c(0.01, 1500),
kl_fun = kl_fun_hill,
rival_pars = c(1.0),
rival_lower = c(1e-4),
rival_upper = c(1e6)
)
#>
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#>
⠙ Calculating optimal design 5 done (40/s) | 127ms
ℹ The lower bound for efficiency is 99.9997190548208%
#>
result_kl_hill
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0.01 0.233994
#> 2 1500.00 0.766006
summary(result_kl_hill)
#> Model:
#> y ~ (Econ - b) * (x/IC50)^s/(1 + (x/IC50)^s) + b
#> Weight function:
#> function (x)
#> 1
#>
#> KL function: user-supplied
#> Optimal rival parameters: 0.7192
#>
#> Optimal design for KL-Optimality:
#> Point Weight
#> 1 0.01 0.233994
#> 2 1500.00 0.766006
#>
#> Minimum efficiency (Atwood): 99.9997190548208%
#> Criterion value: 0.813492KL sensitivity for the Hill model (error structure discrimination).
The KL-optimal design places all weight at the two extremes of the design space. This matches the result reported in the paper (support at \(x = 0.01\) with weight 0.23 and \(x = 1500\) with weight 0.77). The Atwood criterion at 100 % confirms the design is exactly optimal.
The optimal rival variance recovered internally:
hv_hill <- attr(result_kl_hill, "hidden_value")
cat("Optimal rival sigma_abs^2 =", round(hv_hill$beta2_star, 4), "\n")
#> Optimal rival sigma_abs^2 = 0.7192Note that with only two support points this design cannot estimate
all four parameters of the Hill model. The paper proposes combining it
with a D-optimal design via the augmentation workflow described in
vignette("optedr-augment").