Introduction to optedr: optimal designs for non-linear models

The optimal design problem

In non-linear regression the information that an experiment provides about the model parameters depends on where observations are taken, not just how many. The Fisher information matrix at a design point $x$ is

\[M(\xi) = \int f(x)\, f(x)^{\top} \, \xi(dx),\]

where $f(x) = \nabla_\theta \mu(x, \theta)$ is the gradient of the mean response. Different optimality criteria summarise $M(\xi)$ into a single number to minimise:

Criterion	Objective	Key argument
D-Optimality	$-\log\det M$	—
Ds-Optimality	focuses on a subset of parameters	`par_int`
A-Optimality	$\operatorname{tr} M^{-1}$	—
I-Optimality	integrated prediction variance over a region	`reg_int`
L-Optimality	$\operatorname{tr}(B \, M^{-1})$ for user-supplied $B$	`matB`

All are computed with opt_des(), which implements the cocktail algorithm (Yu, 2011).

A first example: D-Optimality in one factor

The Antoine equation models vapour pressure as a function of temperature. With two parameters and one design variable the D-optimal design has exactly two support points.

result_D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (33/s) | 664ms
ℹ The lower bound for efficiency is 99.9982181686267%
#> 
                                                    
result_D
#> Optimal design for D-Optimality:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911

plot() shows the sensitivity function $d(x, \xi)$. By the Equivalence Theorem the design is optimal if and only if $d(x, \xi) \leq k$ everywhere, with equality at every support point.

plot(result_D)

Sensitivity function for the D-optimal design.

summary() gives a fuller breakdown including the criterion value and convergence diagnostics.

summary(result_D)
#> Model:
#> y ~ a * exp(-b/x)
#> Weight function:
#> function (x) 
#> 1 
#> 
#> Optimal design for D-Optimality:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911
#> 
#> Minimum efficiency (Atwood): 99.9982181686267%
#> Criterion value:             9.97281e+06

The $atwood component reports the Atwood criterion (percentage deviation from the theoretical optimum bound): values close to 100 % confirm convergence.

Other optimality criteria

Ds-Optimality

par_int selects the indices of the parameters of interest. Here we focus only on th0:

result_Ds <- opt_des(
  criterion    = "Ds-Optimality",
  model        = y ~ th0 * exp(x / th1),
  parameters   = c("th0", "th1"),
  par_values   = c(10.4963, -3.2940),
  design_space = c(0.94, 30),
  par_int      = c(1)
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 9 done (31/s) | 288ms
ℹ The lower bound for efficiency is 99.9998969351635%
#> 
                                                   
result_Ds
#> Optimal design for Ds-Optimality:
#>      Point    Weight
#> 1 0.940000 0.6042053
#> 2 5.147919 0.3957947

A-Optimality

Minimises the average variance of the parameter estimators:

result_A <- opt_des(
  criterion    = "A-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (35/s) | 629ms
ℹ The lower bound for efficiency is 99.9999982911678%
#> 
                                                    
result_A
#> Optimal design for A-Optimality:
#>      Point    Weight
#> 1 310.3784 0.7821613
#> 2 422.0000 0.2178387

I-Optimality

Minimises the integrated prediction variance over a region of interest reg_int. Useful when prediction in a specific sub-range matters more than global estimation:

result_I <- opt_des(
  criterion    = "I-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422),
  reg_int      = c(380, 422)
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 5 done (42/s) | 119ms
ℹ The lower bound for efficiency is 99.9999993787934%
#> 
                                                   
result_I
#> Optimal design for I-Optimality:
#>     Point    Weight
#> 1 357.518 0.4346119
#> 2 422.000 0.5653881

L-Optimality

matB is a symmetric positive-semidefinite $k \times k$ matrix that selects which variances to minimise. Setting matB = diag(c(1, 0)) focuses entirely on the variance of a:

result_L <- opt_des(
  criterion    = "L-Optimality",
  model        = y ~ a * exp(-b / x),
  parameters   = c("a", "b"),
  par_values   = c(1, 1500),
  design_space = c(212, 422),
  matB         = diag(c(1, 0))
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 22 done (34/s) | 642ms
ℹ The lower bound for efficiency is 99.9999982893632%
#> 
                                                    
result_L
#> Optimal design for L-Optimality:
#>      Point    Weight
#> 1 310.3784 0.7253377
#> 2 422.0000 0.2746623

Comparing D- and L-optimal designs shows how the support points shift when precision is concentrated on one parameter:

cat("D-optimal support:\n"); print(result_D$optdes)
#> D-optimal support:
#>      Point    Weight
#> 1 329.2963 0.5000089
#> 2 422.0000 0.4999911
cat("L-optimal support:\n"); print(result_L$optdes)
#> L-optimal support:
#>      Point    Weight
#> 1 310.3784 0.7253377
#> 2 422.0000 0.2746623

Multi-dimensional design spaces

For models with two or more design variables pass design_space as a named list. Variable names in the list must match those used in model.

Two-factor model: bisubstrate Michaelis-Menten

Enzyme kinetics with two substrates: $\mu = V_{\max} x_1 x_2 / [(K_1 + x_1)(K_2 + x_2)]$.

result_2D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters   = c("Vmax", "K1", "K2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10))
)
#> 
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠙ Calculating optimal design 14 done (13/s) | 1s
ℹ The lower bound for efficiency is 99.9990270465501%
#> 
                                                 
result_2D
#> Optimal design for D-Optimality (2 factors):
#>           x1         x2    Weight
#> 1  0.8322886 10.0000000 0.3333304
#> 2 10.0000000  0.8343059 0.3333391
#> 3 10.0000000 10.0000000 0.3333304

plot() for a two-factor model returns a viridis heatmap of the sensitivity function, with the support points overlaid in red:

plot(result_2D)

Sensitivity heatmap for the 2D D-optimal design.

L-Optimality extends naturally to multi-dimensional spaces. Here we minimise only the variance of $K_1$:

result_2D_L <- opt_des(
  criterion    = "L-Optimality",
  model        = y ~ Vmax * x1 * x2 / ((K1 + x1) * (K2 + x2)),
  parameters   = c("Vmax", "K1", "K2"),
  par_values   = c(1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10)),
  matB         = diag(c(0, 1, 0))
)
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> Warning: The information matrix has 1 near-zero singular value(s) (rank 2 out
#> of 3). The model likely has 1 fewer identifiable parameter(s) than specified
#> -check for parameter redundancy or collinearity.
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (19/s) | 1.2s
#> Warning: The Atwood efficiency bound is -412.6%, which is outside [0, 100].
#> This indicates a degenerate information matrix -the model is likely not
#> identifiable in all specified parameters. Check for parameter redundancy (see
#> any preceding warnings).
#> ℹ The lower bound for efficiency is -412.552180354584%
#> 
                                                   
result_2D_L
#> Optimal design for L-Optimality (2 factors):
#>           x1 x2    Weight
#> 1  0.6043674 10 0.7070278
#> 2 10.0000000 10 0.2929722

Three factors and beyond

For $d \geq 3$ factors plot() switches to a scatter-matrix with $\binom{d}{2}$ panels, one per pair of variables. Point size is proportional to weight.

result_3D <- opt_des(
  criterion    = "D-Optimality",
  model        = y ~ Vmax * x1 * x2 * x3 / ((K1+x1) * (K2+x2) * (K3+x3)),
  parameters   = c("Vmax", "K1", "K2", "K3"),
  par_values   = c(1, 1, 1, 1),
  design_space = list(x1 = c(0.1, 10), x2 = c(0.1, 10), x3 = c(0.1, 10))
)
#> 
⠙ Calculating optimal design 6 done (2.9/s) | 2.1s

⠹ Calculating optimal design 7 done (3/s) | 2.3s  

⠸ Calculating optimal design 8 done (3.1/s) | 2.6s

⠼ Calculating optimal design 9 done (3.1/s) | 2.9s

⠴ Calculating optimal design 10 done (3.2/s) | 3.2s

⠦ Calculating optimal design 11 done (3.2/s) | 3.4s

⠧ Calculating optimal design 12 done (3.2/s) | 3.7s

⠇ Calculating optimal design 13 done (3.3/s) | 4s  

⠏ Calculating optimal design 14 done (3.3/s) | 4.3s

⠋ Calculating optimal design 15 done (3.3/s) | 4.6s

⠙ Calculating optimal design 16 done (3.3/s) | 4.9s

⠹ Calculating optimal design 17 done (3.3/s) | 5.2s

⠸ Calculating optimal design 18 done (3.3/s) | 5.4s

⠼ Calculating optimal design 19 done (3.3/s) | 5.7s
#> ℹ Stop condition reached: difference between sensitivity and criterion < 1e-05
#> 
⠴ Calculating optimal design 20 done (3.3/s) | 6s  
ℹ The lower bound for efficiency is 99.9990957865302%
#> 
                                                  
result_3D
#> Optimal design for D-Optimality (3 factors):
#>           x1         x2        x3    Weight
#> 1  0.8326969 10.0000000 10.000000 0.2499979
#> 2 10.0000000 10.0000000 10.000000 0.2499979
#> 3 10.0000000 10.0000000  0.834917 0.2500063
#> 4 10.0000000  0.8340851 10.000000 0.2499979
plot(result_3D)

Compound criterion

Sometimes no single criterion captures the experimental goals. criterion = "Compound" combines any set of criteria as a weighted sum of their sensitivity functions. Weights are normalised automatically.

result_DI <- opt_des(
  criterion    = "Compound",
  model        = y ~ 10^(a - b / (c + x)),
  parameters   = c("a", "b", "c"),
  par_values   = c(8.07131, 1730.63, 233.426),
  design_space = c(1, 100),
  compound     = list(
    list(criterion = "D-Optimality", weight = 0.7),
    list(criterion = "I-Optimality", weight = 0.3, reg_int = c(60, 100))
  )
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (12/s) | 1.8s
ℹ The lower bound for efficiency is 99.9984745356004%
#> 
                                                   
result_DI
#> Optimal design for Compound (0.70*D + 0.30*I):
#>       Point    Weight
#> 1  46.50267 0.3047727
#> 2  82.87444 0.4062833
#> 3 100.00000 0.2889441

Comparing with the pure D-optimal design shows how the composite shifts one support point toward the prediction region:

result_D_ant <- opt_des(
  "D-Optimality",
  y ~ 10^(a - b / (c + x)), c("a", "b", "c"),
  c(8.07131, 1730.63, 233.426), c(1, 100)
)
#> 
#> ℹ Stop condition not reached, max iterations performed
#> 
⠙ Calculating optimal design 22 done (27/s) | 807ms
ℹ The lower bound for efficiency is 99.9986187558838%
#> 
                                                    
cat("D-optimal:\n");         print(result_D_ant$optdes)
#> D-optimal:
#>       Point    Weight
#> 1  44.90080 0.3333415
#> 2  83.19032 0.3333292
#> 3 100.00000 0.3333293
cat("Compound D+I (70/30):\n"); print(result_DI$optdes)
#> Compound D+I (70/30):
#>       Point    Weight
#> 1  46.50267 0.3047727
#> 2  82.87444 0.4062833
#> 3 100.00000 0.2889441

Design efficiency

design_efficiency() measures the fraction of the optimal information that a given design achieves: an efficiency of 0.80 means the design needs $1/0.8 = 1.25\times$ more observations to match the optimal.

design_ad_hoc <- data.frame(
  Point  = c(220, 300, 400),
  Weight = c(1/3, 1/3, 1/3)
)
eff <- design_efficiency(design_ad_hoc, result_D)
#> ℹ The efficiency of the design is 47.3464225136928%
cat("Efficiency of ad-hoc design:", round(eff * 100, 2), "%\n")
#> Efficiency of ad-hoc design: 47.35 %

For multi-dimensional designs pass a data frame with one column per factor plus a Weight column:

corners_2d <- data.frame(
  x1     = c(0.1, 10,  0.1, 10),
  x2     = c(0.1, 0.1, 10,  10),
  Weight = rep(0.25, 4)
)
eff_2d <- design_efficiency(corners_2d, result_2D)
#> ℹ The efficiency of the design is 19.3350707865126%
cat("Efficiency of corner design vs 2D D-optimal:", round(eff_2d * 100, 2), "%\n")
#> Efficiency of corner design vs 2D D-optimal: 19.34 %

Rounding to exact designs

Optimal designs assign continuous weights. Two functions convert them to integer replication counts for a given total $n$.

`efficient_round()`

Uses the multiplier $n - l/2$ (where $l$ is the number of support points) and adjusts to guarantee the total is exactly $n$:

exact_design <- efficient_round(result_D$optdes, n = 20)
print(exact_design)
#>      Point Weight
#> 1 329.2963     10
#> 2 422.0000     10
cat("Total observations:", sum(exact_design$Weight), "\n")
#> Total observations: 20

`combinatorial_round()`

Exhaustive floor/ceiling search: optimal for small $k$ (number of support points) since it checks all $2^k$ combinations. Pass the optdes object so the function can use the criterion value for comparison:

combo_design <- combinatorial_round(result_D, n = 10)
print(combo_design)
#>      Point Weight
#> 1 329.2963      5
#> 2 422.0000      5

Once rounded, compare the exact design’s efficiency against the approximate optimum:

approx <- exact_design
approx$Weight <- approx$Weight / sum(approx$Weight)
cat("Efficiency of rounded design:", round(design_efficiency(approx, result_D) * 100, 2), "%\n")
#> ℹ The efficiency of the design is 100.000000015874%
#> Efficiency of rounded design: 100 %

Criterion	Objective	Key argument
D-Optimality	\(-\log\det M\)	—
Ds-Optimality	focuses on a subset of parameters	`par_int`
A-Optimality	\(\operatorname{tr} M^{-1}\)	—
I-Optimality	integrated prediction variance over a region	`reg_int`
L-Optimality	\(\operatorname{tr}(B \, M^{-1})\) for user-supplied \(B\)	`matB`