
Adaptive Randomization

Response-adaptive randomization (RAR) and covariate-adaptive minimization for clinical trials. DBCD, Thompson sampling, Neyman allocation, and Pocock-Simon methods with Monte Carlo validation.

1. Overview & Motivation

Adaptive randomization modifies the probability of treatment assignment based on information accruing during the trial. Two distinct approaches are supported:

Outcome-Adaptive (RAR)

Allocates more patients to arms showing better outcomes. Methods: DBCD, Thompson sampling, Neyman allocation. Supports binary, continuous, and survival endpoints.

Covariate-Adaptive (Minimization)

Balances treatment arms across prognostic factors. Method: Pocock-Simon with configurable determinism, weighted factors, range or variance imbalance measures.

Key distinction: Outcome-adaptive randomization modifies allocation based on efficacy data, while minimization modifies allocation based on baseline covariates. Minimization does not use outcome data and does not introduce the same inferential challenges.

2. Outcome-Adaptive Methods

2.1 DBCD — Doubly-Adaptive Biased Coin Design

The DBCD (Hu & Zhang 2004) targets an optimal allocation ratio using a biased coin function. At each allocation, the probability of assigning to arm $j$ depends on the distance between the current allocation and the target:

$$P(\text{arm } j) = g\!\left(\frac{\rho_j^*}{\hat{\rho}_j}\right) \Big/ \sum_k g\!\left(\frac{\rho_k^*}{\hat{\rho}_k}\right)$$

where $g(x) = x^\gamma$ and $\gamma \geq 0$ controls convergence speed. Higher $\gamma$ forces the allocation to track the target more tightly, at the cost of more deterministic (and hence more predictable) assignments.
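The coin can be written in a few lines; `dbcd_probs` is an illustrative name, not part of the engine's API. Burn-in (section 4) guarantees every arm has a nonzero current allocation before the coin is applied:

```python
import numpy as np

def dbcd_probs(target, current, gamma=2.0):
    """DBCD allocation probabilities: apply g(x) = x**gamma to the
    target/current allocation ratios, then normalize to sum to 1."""
    w = (np.asarray(target, float) / np.asarray(current, float)) ** gamma
    return w / w.sum()

# An under-allocated arm (current below target) gets a boosted probability:
probs = dbcd_probs(target=[0.6, 0.4], current=[0.5, 0.5], gamma=2.0)
```

With `gamma=0`, the coin ignores the imbalance entirely and returns equal probabilities, illustrating the low end of the convergence-speed tradeoff.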

Rosenberger Optimal Allocation

For binary endpoints, the Rosenberger optimal allocation minimizes expected failures while maintaining a fixed power level:

$$\rho_j^* = \frac{\sqrt{p_j}}{\sum_k \sqrt{p_k}}$$

where $p_j$ is the response rate for arm $j$. Arms with higher response rates receive more patients, reducing the total expected number of failures (ENF) across the trial.
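The target is straightforward to compute; `rosenberger_allocation` is an illustrative helper name. The example reproduces the validation case in section 9 ($p_0 = 0.2$, $p_1 = 0.4$):

```python
import math

def rosenberger_allocation(rates):
    """Rosenberger optimal target: rho_j* = sqrt(p_j) / sum_k sqrt(p_k)."""
    roots = [math.sqrt(p) for p in rates]
    total = sum(roots)
    return [r / total for r in roots]

alloc = rosenberger_allocation([0.20, 0.40])   # alloc[1] ≈ 0.586
```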

2.2 Thompson Sampling (Clipped)

Thompson sampling draws from posterior distributions of each arm and allocates proportionally to the probability of being best. The implementation uses clipped allocation bounds to prevent extreme imbalance:

$$\pi_j = \operatorname{clip}\!\left(P\big(\theta_j = \max_k \theta_k\big),\ \delta,\ 1 - (K-1)\delta\right)$$

where $\delta$ is the minimum per-arm allocation probability. After clipping, probabilities are renormalized. Clipping is applied iteratively to ensure all arms remain within bounds after renormalization.
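One way to implement the iterative clipping is to pin out-of-bounds arms to the nearest bound and redistribute the remaining mass proportionally among the free arms; this is a minimal sketch, not necessarily the engine's exact scheme:

```python
import numpy as np

def clip_allocation(probs, delta):
    """Clip allocation probabilities to [delta, 1-(K-1)*delta] so that the
    result still sums to 1 and every arm stays within bounds."""
    p = np.asarray(probs, dtype=float)
    K = len(p)
    lo, hi = delta, 1.0 - (K - 1) * delta
    pinned = np.zeros(K, dtype=bool)
    for _ in range(K):
        out_lo = (p < lo) & ~pinned
        out_hi = (p > hi) & ~pinned
        if not (out_lo.any() or out_hi.any()):
            break                              # all arms within bounds
        p[out_lo], p[out_hi] = lo, hi          # pin violators to the bound
        pinned |= out_lo | out_hi
        free = ~pinned
        if free.any():                         # share leftover mass proportionally
            p[free] *= (1.0 - p[pinned].sum()) / p[free].sum()
    return p
```

Pinning at most one new arm per pass means the loop terminates within $K$ iterations.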

Literature note: Thompson sampling has higher wrong-direction imbalance probability than DBCD (Robertson et al. 2023). DBCD is recommended for confirmatory trials; Thompson is more appropriate for exploratory or dose-finding studies.

2.3 Neyman Allocation

Neyman allocation minimizes the variance of the treatment effect estimator by allocating proportionally to each arm's standard deviation:

$$\rho_j^* = \frac{\sigma_j}{\sum_k \sigma_k}$$

For binary endpoints, $\sigma_j = \sqrt{p_j(1-p_j)}$. Unlike DBCD, Neyman allocation is applied as a fixed target: the allocation does not adapt to accruing outcomes but uses the initial estimates. This makes it a useful baseline comparator rather than a fully adaptive method.
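A minimal helper for the binary case (illustrative name and code, not the engine's API):

```python
import math

def neyman_allocation(rates):
    """Neyman target for binary endpoints: rho_j* proportional to
    sigma_j = sqrt(p_j * (1 - p_j))."""
    sds = [math.sqrt(p * (1 - p)) for p in rates]
    total = sum(sds)
    return [sd / total for sd in sds]

alloc = neyman_allocation([0.20, 0.40])
```

Here the higher-variance arm ($p = 0.4$, $\sigma \approx 0.49$) receives slightly more patients than the $p = 0.2$ arm ($\sigma = 0.40$).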

2.4 Survival Endpoints

For time-to-event endpoints, the allocation update uses interim event rates rather than response rates. Patients contribute to allocation updates only after a minimum follow-up period. Dropout is applied during the accrual phase (not just at final analysis) so that dropped patients do not influence allocation probabilities.

The event count requirement uses the Schoenfeld formula with Bonferroni adjustment for multi-arm designs:

$$d = \frac{(z_{\alpha/(K-1)} + z_\beta)^2}{\left(\frac{\log \mathrm{HR}}{2}\right)^2}$$

where $K$ is the number of arms. For two-arm designs, this reduces to the standard Schoenfeld formula.
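The event count is easy to reproduce with the standard normal quantile function; `schoenfeld_events` is an illustrative name:

```python
import math
from statistics import NormalDist

def schoenfeld_events(hr, alpha=0.025, power=0.80, n_arms=2):
    """Required events per comparison:
    d = (z_{alpha/(K-1)} + z_beta)^2 / (log(HR)/2)^2."""
    z = NormalDist().inv_cdf
    z_a = z(1 - alpha / (n_arms - 1))   # one-sided, Bonferroni over K-1 comparisons
    z_b = z(power)
    return (z_a + z_b) ** 2 / (math.log(hr) / 2) ** 2

events = schoenfeld_events(hr=0.7)
```

For a two-arm design this reproduces the classic $4(z_\alpha + z_\beta)^2/(\log \mathrm{HR})^2$ count: about 247 events for HR = 0.7 at one-sided $\alpha = 0.025$ and 80% power.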

3. Covariate-Adaptive Minimization

Pocock-Simon Method

The Pocock-Simon minimization algorithm (1975) assigns each new patient to the arm that minimizes a weighted imbalance function across prognostic factors. The assignment is probabilistic:

1. Compute imbalance: For each arm, hypothetically assign the patient and compute the total weighted imbalance across all factor levels. Two measures are supported: range (max − min of arm counts per level) and variance (sample variance of arm counts per level).

2. Identify minimizing arm: The arm that produces the smallest total weighted imbalance is identified. If multiple arms are tied, one is chosen at random among the tied arms.

3. Biased coin assignment: Assign to the minimizing arm with probability $p$ (typically 0.70–0.85) and to a random other arm with probability $1-p$. Setting $p = 1.0$ gives deterministic minimization; $p = 0.5$ gives pure random allocation.

Factor weights: Each prognostic factor can be assigned a weight (default 1.0) reflecting its clinical importance. The total imbalance is the weighted sum across factors. Factors with higher weights receive stronger balance enforcement.
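The three steps can be sketched as a single assignment function. This is an illustrative sketch with hypothetical data structures, not the calculator's internal code:

```python
import random

def pocock_simon_assign(counts, profile, weights, p=0.75, rng=None):
    """Pocock-Simon assignment with the range imbalance measure.
    counts[f][l] is a length-K list: patients at level l of factor f, per arm.
    profile[f] is the incoming patient's level for factor f."""
    rng = rng or random.Random()
    n_arms = len(counts[0][0])
    imbalance = []
    for arm in range(n_arms):
        total = 0.0
        for f, level in enumerate(profile):
            row = list(counts[f][level])
            row[arm] += 1                       # hypothetical assignment
            total += weights[f] * (max(row) - min(row))
        imbalance.append(total)
    best = min(imbalance)
    preferred = rng.choice([k for k, d in enumerate(imbalance) if d == best])
    if rng.random() < p:
        return preferred                        # biased coin: favored arm
    return rng.choice([k for k in range(n_arms) if k != preferred])
```

The caller is responsible for incrementing `counts` after the assignment is made.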

4. Simulation Algorithm

RAR Monte Carlo Procedure

Each simulation trial proceeds through three phases: burn-in, adaptive allocation, and final hypothesis test. The full procedure is repeated under both H₀ (no treatment effect) and H₁ (true effect as specified).

1. Burn-in (equal allocation): The first $n_{\text{burn}} = \lfloor f_b \cdot N \rfloor$ patients are allocated by deterministic round-robin across arms. This guarantees every arm receives enough data for initial rate/mean estimation. Default burn-in fraction $f_b = 0.20$.

2. Adaptive phase: For patients $i > n_{\text{burn}}$, the allocation probability is recalculated every $u$ patients (default $u = 1$) based on accrued outcome data. The method-specific function (DBCD coin, Thompson posterior draw, or Neyman target) produces allocation probabilities, which are clipped to $[\delta,\ 1-(K-1)\delta]$ and renormalized. For survival endpoints, only patients with follow-up exceeding $t_{\min}$ months contribute to rate estimation, and dropout is applied during the accrual phase (not just at final analysis).

3. Outcome generation: Binary: $Y_i \sim \text{Bernoulli}(p_{a_i})$. Continuous: $Y_i \sim N(\mu_{a_i}, \sigma^2)$. Survival: $T_i \sim \text{Exp}(\lambda_{a_i})$ with administrative and dropout censoring.

4. Final hypothesis test: Two-arm: one-sided z-test (binary/continuous) or logrank test (survival). Multi-arm: pairwise comparisons against control with Bonferroni correction at $\alpha/(K-1)$. Rejection occurs if any pairwise comparison is significant.

5. Equal-randomization benchmark: Each RAR trial is paired with an equal-probability randomized trial using the same parameters. The benchmark uses true random allocation (not deterministic cycling), so its operating characteristics reflect realistic randomization variance.

Reproducibility: When a seed is provided, each simulation uses a deterministic RNG chain. The engine stores an input_hash (SHA-256 of all parameters) to verify that repeated runs produce identical results.
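The burn-in and adaptive phases can be sketched for a single binary-endpoint DBCD trial. This is a compact illustration, not the engine's implementation; the add-one smoothing of the rate estimates and all names are assumptions:

```python
import numpy as np

def simulate_rar_trial(rates, n_total, burn_in_fraction=0.20, delta=0.10,
                       gamma=2.0, update_every=1, seed=None):
    """One binary-endpoint DBCD trial: round-robin burn-in, then adaptive
    allocation toward the Rosenberger target with clipping."""
    rng = np.random.default_rng(seed)
    n_arms = len(rates)
    n_burn = int(burn_in_fraction * n_total)
    arms = np.zeros(n_total, dtype=int)
    outcomes = np.zeros(n_total, dtype=int)
    probs = np.full(n_arms, 1.0 / n_arms)
    for i in range(n_total):
        if i < n_burn:
            arm = i % n_arms                        # deterministic round-robin
        else:
            if (i - n_burn) % update_every == 0:
                # response-rate estimates with add-one smoothing (assumption)
                masks = [arms[:i] == j for j in range(n_arms)]
                p_hat = np.array([(outcomes[:i][m].sum() + 1.0) /
                                  (m.sum() + 2.0) for m in masks])
                target = np.sqrt(p_hat) / np.sqrt(p_hat).sum()  # Rosenberger
                current = np.bincount(arms[:i], minlength=n_arms) / i
                w = (target / np.maximum(current, 1e-12)) ** gamma  # DBCD coin
                probs = np.clip(w / w.sum(), delta, 1 - (n_arms - 1) * delta)
                probs = probs / probs.sum()
            arm = rng.choice(n_arms, p=probs)
        arms[i] = arm
        outcomes[i] = rng.binomial(1, rates[arm])
    return arms, outcomes
```

Repeating this loop under both hypotheses and tallying rejections yields the power and Type I error estimates described in section 5.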

Minimization Simulation Procedure

The minimization simulator generates patient factor profiles from the specified prevalence distributions, applies the Pocock-Simon algorithm, and compares balance against a pure-random control.

1. Generate patient profile: For each patient, sample a level for each factor from the multinomial distribution defined by that factor's prevalences.

2. Compute hypothetical imbalances: For each candidate arm, hypothetically assign the patient and compute the weighted sum of per-factor imbalances. Range measure: $G_f = \max_j n_{fj} - \min_j n_{fj}$. Variance measure: $G_f = \text{Var}(n_{f1}, \ldots, n_{fK})$. Total: $D_k = \sum_f w_f \cdot G_f(k)$.

3. Assign with biased coin: Assign to the arm minimizing $D_k$ with probability $p$, or to a uniform random other arm with probability $1-p$.

5. Operating Characteristics

RAR Simulation Outputs

When simulation is enabled, the calculator runs Monte Carlo trials under both H₀ and H₁ to estimate:

| Metric | Description |
|---|---|
| power | Proportion of H₁ simulations rejecting the null |
| type1_error | Proportion of H₀ simulations falsely rejecting |
| ens | Expected number of successes across all arms |
| enf | Expected number of failures across all arms |
| wrong_direction_probability | Probability that the best arm does not receive the most patients |
| comparison_equal | Same metrics under true equal-probability randomization |

Minimization Simulation Outputs

Minimization simulation compares factor-level balance against pure random allocation. Power and Type I error are not applicable since minimization does not modify the hypothesis test.

| Metric | Description |
|---|---|
| factor_balance | Per-factor mean imbalance: minimization vs pure random |
| overall_weighted_imbalance | Weighted mean imbalance with reduction percentage |
| arm_counts | Per-arm sample size distribution (mean ± SD) |

6. Statistical Assumptions

RAR — Outcome-Adaptive

  • Stationarity: Response rates (or hazard rates) are constant over time within each arm. Time trends violate this assumption and can inflate Type I error beyond the nominal level.
  • Independence: Patient outcomes are independent conditional on arm assignment. Clustered or correlated outcomes (e.g., site effects) are not modeled.
  • Known effect structure: The true rates or means per arm are fixed but unknown. The simulator assumes the user-specified values as the true parameters under H₁.
  • Binary: independent Bernoulli. Each outcome is an independent draw with success probability $p_j$.
  • Continuous: common variance. All arms share a common standard deviation $\sigma$. Heteroscedasticity (different variances per arm) is not modeled.
  • Survival: exponential model. Event times follow an exponential distribution with arm-specific hazard $\lambda_j = \ln(2)/m_j$, where $m_j$ is the median survival in arm $j$. Non-proportional hazards and cure-rate models are not supported.
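Under the exponential model, the median-to-hazard conversion and the implied event probabilities follow directly. For example, with a 12-month control median:

```python
import math

median_control = 12.0
lam = math.log(2) / median_control          # lambda_j = ln(2) / m_j
p_event_24 = 1 - math.exp(-lam * 24.0)      # P(event within 24 months)
```

Two medians of follow-up give an event probability of exactly $1 - 2^{-2} = 0.75$.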

Minimization

  • Known factor distributions: Prevalences for each factor level are specified upfront and assumed accurate. Actual enrollment distributions may differ.
  • Factor independence: Patient factor profiles are generated independently. Correlation between factors (e.g., age and comorbidity) is not modeled.
  • No outcome modeling: Minimization only addresses balance, not treatment effect. Power and Type I error depend on the analysis method, not the randomization scheme.

7. Limitations & When Not to Use

When RAR May Not Be Appropriate

Time trends present: If disease severity, standard of care, or patient population changes during enrollment, RAR can dramatically inflate Type I error. Korn & Freidlin (2011) showed inflation from 5% to >25% under realistic drift scenarios. Use stratified randomization or fixed allocation when temporal confounders are plausible.

Confirmatory phase III trials: Proschan & Evans (2020) identify five problems with RAR in confirmatory settings: (1) bias in treatment effect estimates, (2) inflated Type I error, (3) reduced power per patient, (4) predictability enabling selection bias, (5) logistical complexity. DBCD mitigates issues 3–4 but not 1–2.

Delayed outcomes: When response takes months to observe (e.g., 6-month survival endpoint), the adaptive updates lag behind enrollment. With fast accrual and slow response, most patients are enrolled before meaningful adaptation occurs. The calculator warns when the expected event probability at minimum follow-up is below 60%.

Small total N: With fewer than ~50 patients per arm, the burn-in phase consumes most of the sample and the adaptive phase has too few updates to meaningfully deviate from equal allocation.

When Minimization May Not Be Needed

Few or no critical prognostic factors: If no strong prognostic factors are known, simple or stratified randomization is easier to implement and equally effective.

Large trials: With N > 500, the law of large numbers ensures good balance even under simple randomization; minimization provides only marginal benefit.

8. Regulatory Considerations

Outcome-Adaptive Randomization

  • FDA Guidance on Adaptive Designs (2019) permits RAR but recommends careful evaluation of Type I error and bias in treatment effect estimates.
  • Time trends can inflate Type I error from 5% to >25% (Korn & Freidlin 2011). RAR should be used with caution when temporal confounders are plausible.
  • DBCD has near-zero wrong-direction probability; Thompson sampling can have 10–14% wrong-direction probability depending on effect size (Thall, Fox & Wathen 2015).
  • Burn-in period with equal allocation is required to prevent early extreme imbalance. Minimum 20% of total enrollment is recommended.

Minimization

  • EMA and ICH E9 recognize minimization as an acceptable randomization method. CPMP guidance recommends $p \leq 0.80$ to preserve unpredictability.
  • Deterministic minimization ($p = 1.0$) makes treatment predictable and is generally discouraged for confirmatory trials.
  • Analysis should adjust for minimization factors (stratified analysis or covariate adjustment) to maintain validity.

9. Monte Carlo Validation

The RAR and minimization simulators are validated against known theoretical properties and published benchmarks. Key validation tests include:

Rosenberger Allocation Accuracy

For two-arm binary with $p_0 = 0.2$, $p_1 = 0.4$, the theoretical optimal allocation is $\rho^* = \sqrt{0.4}/(\sqrt{0.2}+\sqrt{0.4}) \approx 0.586$. The computed value matches to 4 decimal places.

Type I Error Control Under H₀

Under H₀ (equal rates across arms), the simulator verifies that the rejection rate does not exceed the nominal $\alpha$ plus Monte Carlo noise. With 10,000 simulations at $\alpha = 0.025$, the 95% CI for the observed Type I error is $[0.022, 0.028]$.
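The quoted interval follows from the binomial standard error of the rejection rate:

```python
import math

alpha, n_sims = 0.025, 10_000
se = math.sqrt(alpha * (1 - alpha) / n_sims)   # binomial SE of the rejection rate
ci = (alpha - 1.96 * se, alpha + 1.96 * se)    # ≈ (0.022, 0.028)
```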

DBCD Convergence

Mean simulated allocation to the better arm converges to the Rosenberger target as $N \to \infty$. The allocation trajectory band (p10–p90) narrows with increasing $\gamma$.

Minimization Balance Improvement

With $p = 0.75$ and 2 binary factors, minimization reduces expected imbalance (range measure) by 60–80% compared to pure random allocation, matching published benchmarks in Pocock & Simon (1975).

Reproducibility

Re-running the same request with the same seed produces bit-identical results. The engine verifies this by computing an input hash and comparing output checksums.
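A canonical input hash can be sketched as follows; the exact canonicalization the engine uses (key ordering, separators) is an assumption here:

```python
import hashlib
import json

def input_hash(params: dict) -> str:
    """SHA-256 over a canonical JSON encoding of the request parameters."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

# Key order does not affect the hash:
h1 = input_hash({"method": "dbcd", "n_total": 200})
h2 = input_hash({"n_total": 200, "method": "dbcd"})
```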

Test suite: The RAR calculator is covered by 50 unit tests (allocation functions, simulation, validation rules, survival endpoints) and the minimization calculator by 34 tests. All tests are run on every deployment.

10. API Reference

POST /api/v1/calculators/rar

Outcome-adaptive randomization with analytical allocations and optional Monte Carlo simulation.

Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| method | string | "dbcd" | "dbcd", "thompson", or "neyman" |
| endpoint_type | string | "binary" | "binary", "continuous", or "survival" |
| n_arms | int | 2 | Total arms including control [2, 6] |
| n_total | int | 200 | Total sample size [20, 10000] |
| alpha | float | 0.025 | One-sided significance level (0, 1) |
| arm_rates | float[]? | null | Binary: response rate per arm (length = n_arms) |
| arm_means | float[]? | null | Continuous: mean outcome per arm (length = n_arms) |
| common_sd | float? | 1.0 | Continuous: common SD across arms (>0) |
| hazard_ratio | float? | 0.7 | Survival: treatment/control HR (0, 1) |
| median_control | float? | 12.0 | Survival: median control survival in months (>0) |
| accrual_time | float? | 24.0 | Survival: accrual period in months (>0) |
| follow_up_time | float? | 12.0 | Survival: follow-up after accrual in months (≥0) |
| dropout_rate | float? | 0.0 | Survival: annual dropout rate [0, 1) |
| min_follow_up | float? | 3.0 | Survival: min follow-up before patient informs allocation (≥0) |
| burn_in_fraction | float | 0.20 | Fraction of N with equal allocation [0.05, 0.50] |
| allocation_bounds_delta | float | 0.10 | Min per-arm probability [0.01, 0.25]; δ × n_arms < 1 |
| dbcd_gamma | float | 2.0 | DBCD convergence parameter [0.5, 10] |
| update_frequency | int | 1 | Recalculate allocation every N patients [1, 50] |
| simulate | bool | false | Enable Monte Carlo simulation tier |
| simulation_seed | int? | null | Seed for reproducibility; auto-generated if omitted |
| n_simulations | int | 10000 | Number of simulations [1000, 100000] |

Example Request

{
  "method": "dbcd",
  "endpoint_type": "binary",
  "n_arms": 2,
  "n_total": 200,
  "arm_rates": [0.20, 0.35],
  "burn_in_fraction": 0.20,
  "allocation_bounds_delta": 0.10,
  "dbcd_gamma": 2.0,
  "simulate": true,
  "n_simulations": 10000
}

Example Request (Survival Endpoint)

{
  "method": "dbcd",
  "endpoint_type": "survival",
  "n_arms": 3,
  "n_total": 450,
  "hazard_ratio": 0.7,
  "median_control": 12.0,
  "accrual_time": 24.0,
  "follow_up_time": 12.0,
  "dropout_rate": 0.05,
  "min_follow_up": 3.0,
  "simulate": true,
  "n_simulations": 5000
}

Response Fields

| Field | Description |
|---|---|
| rosenberger_optimal_allocation | Optimal allocation proportions per arm |
| neyman_allocation | Neyman variance-minimizing allocation per arm |
| equal_allocation | Equal allocation (1/K per arm) |
| expected_power_equal | Analytical power under equal allocation (2-arm only) |
| events_required_80pct | Survival only: per_comparison, total_approximate, and note |
| expected_event_rates | Survival only: expected event probability per arm at study end |
| design_summary | Human-readable design description |
| regulatory_notes | FDA/EMA guidance citations |

POST /api/v1/calculators/minimization

Pocock-Simon covariate-adaptive minimization with configurable prognostic factors.

Request Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| n_arms | int | 2 | Total arms including control [2, 6] |
| n_total | int | 200 | Total sample size [20, 10000] |
| p_randomization | float | 0.75 | Probability of minimizing assignment [0.50, 1.0] |
| imbalance_function | string | "range" | "range" (max − min) or "variance" |
| factors | object[] | 2 defaults | Each: name, levels[], prevalences[], weight (1–10 factors) |
| simulate | bool | false | Enable Monte Carlo simulation |
| n_simulations | int | 5000 | Number of simulations [500, 50000] |

Example Request

POST /api/v1/calculators/minimization
{
  "n_arms": 2,
  "n_total": 200,
  "p_randomization": 0.75,
  "imbalance_function": "range",
  "factors": [
    { "name": "Age", "levels": ["<65", ">=65"],
      "prevalences": [0.6, 0.4], "weight": 1.0 },
    { "name": "Sex", "levels": ["M", "F"],
      "prevalences": [0.5, 0.5], "weight": 1.0 }
  ],
  "simulate": true,
  "n_simulations": 5000
}

11. Technical References

  1. Hu F, Zhang LX (2004). Asymptotic properties of doubly adaptive biased coin designs for multitreatment clinical trials. Annals of Statistics, 32(1), 268–301.
  2. Rosenberger WF, Stallard N, Ivanova A, Harper CN, Ricks ML (2001). Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.
  3. Robertson DS, Choodari-Oskooei B, Dimairo M, Flight L, Pallmann P, Jaki T (2023). Response-adaptive randomization in clinical trials: from myths to practical considerations. Statistical Science, 38(2), 185–208.
  4. Korn EL, Freidlin B (2011). Outcome-adaptive randomization: is it useful? Journal of Clinical Oncology, 29, 771–776.
  5. Thall PF, Fox PS, Wathen JK (2015). Statistical controversies in clinical research. Annals of Oncology, 26, 1563–1573.
  6. Tymofyeyev Y, Rosenberger WF, Hu F (2007). Implementing optimal allocation in sequential binary response experiments. JASA, 102, 224–234.
  7. Wathen JK, Thall PF (2017). A simulation study of outcome adaptive randomization in multi-arm clinical trials. Clinical Trials, 14(5), 432–440.
  8. Proschan MA, Evans SR (2020). Resist the temptation of response-adaptive randomization. Clinical Infectious Diseases, 71(11), 3002–3004.
  9. Pocock SJ, Simon R (1975). Sequential treatment assignment with balancing for prognostic factors. Biometrics, 31, 103–115.
  10. FDA (2019). Adaptive designs for clinical trials of drugs and biologics: guidance for industry.