Adaptive Randomization
Response-adaptive randomization (RAR) and covariate-adaptive minimization for clinical trials. DBCD, Thompson sampling, Neyman allocation, and Pocock-Simon methods with Monte Carlo validation.
1. Overview & Motivation
Adaptive randomization modifies the probability of treatment assignment based on information accruing during the trial. Two distinct approaches are supported:
Outcome-Adaptive (RAR)
Allocates more patients to arms showing better outcomes. Methods: DBCD, Thompson sampling, Neyman allocation. Supports binary, continuous, and survival endpoints.
Covariate-Adaptive (Minimization)
Balances treatment arms across prognostic factors. Method: Pocock-Simon with configurable determinism, weighted factors, range or variance imbalance measures.
Key distinction: Outcome-adaptive randomization modifies allocation based on efficacy data, while minimization modifies allocation based on baseline covariates. Minimization does not use outcome data and does not introduce the same inferential challenges.
2. Outcome-Adaptive Methods
2.1 DBCD — Doubly-Adaptive Biased Coin Design
The DBCD (Hu & Zhang 2004) targets an optimal allocation ratio using a biased coin function. At each allocation, the probability of assigning arm k depends on the distance between the current allocation and the target:
g_k(n) = ρ_k (ρ_k / (N_k(n)/n))^γ / Σ_j ρ_j (ρ_j / (N_j(n)/n))^γ
where ρ_k is the target proportion for arm k, N_k(n)/n is the current observed proportion, and γ ≥ 0 controls convergence speed. Higher γ produces faster convergence to the target but more variable allocation paths.
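As an illustration, the allocation function above can be sketched in a few lines of NumPy. The function name and call signature are hypothetical, not the engine's API:

```python
import numpy as np

def dbcd_probabilities(n_by_arm, targets, gamma=2.0):
    """Hu-Zhang allocation function: push observed proportions toward targets.

    n_by_arm : current patient counts per arm (all > 0, i.e. after burn-in)
    targets  : target allocation proportions rho_k (summing to 1)
    gamma    : convergence parameter; larger = stronger correction
    """
    n_by_arm = np.asarray(n_by_arm, dtype=float)
    rho = np.asarray(targets, dtype=float)
    current = n_by_arm / n_by_arm.sum()           # observed proportions N_k/n
    weights = rho * (rho / current) ** gamma      # biased-coin weight per arm
    return weights / weights.sum()                # renormalize to probabilities

# An under-allocated arm is boosted above its target probability:
p = dbcd_probabilities([40, 60], targets=[0.5, 0.5], gamma=2.0)
```

With a 50/50 target and arm 1 currently under-allocated, `p[0]` exceeds 0.5, pulling the allocation back toward the target.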
Rosenberger Optimal Allocation
For binary endpoints, the Rosenberger optimal allocation minimizes expected failures while maintaining a fixed power level:
ρ₁ = √p₁ / (√p₁ + √p₂)
where p_k is the response rate for arm k. Arms with higher response rates receive more patients, reducing the total expected number of failures (ENF) across the trial.
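The closed form is a one-liner; a minimal sketch (function name illustrative):

```python
import math

def rosenberger_allocation(p1, p2):
    """Two-arm binary optimal allocation: rho_1 = sqrt(p1)/(sqrt(p1)+sqrt(p2)).
    Minimizes expected failures at fixed power (Rosenberger et al. 2001)."""
    s1, s2 = math.sqrt(p1), math.sqrt(p2)
    return s1 / (s1 + s2)

# Proportion allocated to arm 1 when arm 2 has the higher response rate:
rho = rosenberger_allocation(0.20, 0.35)
```

Because arm 2 responds better, `rho` is below 0.5: the lower-rate arm receives fewer patients.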
2.2 Thompson Sampling (Clipped)
Thompson sampling draws from the posterior distribution of each arm and allocates proportionally to the probability of being best. The implementation uses clipped allocation bounds to prevent extreme imbalance:
π_k ∝ P(arm k is best | data),  π_k ← clip(π_k, δ, 1 − δ)
where δ is the minimum per-arm allocation probability. After clipping, probabilities are renormalized. Clipping is applied iteratively to ensure all arms remain within bounds after renormalization.
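A minimal sketch of the clipped update for binary outcomes, assuming Beta(1, 1) priors and a Monte Carlo estimate of P(best); names and priors are illustrative, not the engine's implementation:

```python
import numpy as np

rng = np.random.default_rng(42)

def thompson_probabilities(successes, failures, delta=0.10, n_draws=10_000):
    """Estimate P(arm k is best) from Beta(1+s, 1+f) posteriors, then clip
    each probability into [delta, 1-delta] and renormalize iteratively."""
    s = np.asarray(successes)
    f = np.asarray(failures)
    draws = rng.beta(1 + s, 1 + f, size=(n_draws, len(s)))  # posterior samples
    p = np.bincount(draws.argmax(axis=1), minlength=len(s)) / n_draws
    # iterative clipping: renormalization can push arms back out of bounds
    for _ in range(100):
        clipped = np.clip(p, delta, 1 - delta)
        p = clipped / clipped.sum()
        if np.all((p >= delta - 1e-9) & (p <= 1 - delta + 1e-9)):
            break
    return p
```

Even when one arm is near-certainly best, the losing arm keeps at least probability δ, so no arm is starved of data.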
Literature note: Thompson sampling has higher wrong-direction imbalance probability than DBCD (Robertson et al. 2023). DBCD is recommended for confirmatory trials; Thompson is more appropriate for exploratory or dose-finding studies.
2.3 Neyman Allocation
Neyman allocation minimizes the variance of the treatment effect estimator by allocating proportionally to each arm's standard deviation:
ρ_k = σ_k / Σ_j σ_j
For binary endpoints, σ_k = √(p_k(1 − p_k)). Unlike DBCD, Neyman allocation is applied as a fixed target — the allocation does not adapt to accruing outcomes but uses the initial estimates. This makes it a useful baseline comparator rather than a fully adaptive method.
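For illustration, the fixed Neyman target can be computed directly from the user-specified rates (function name hypothetical):

```python
import math

def neyman_allocation(rates):
    """Neyman allocation for binary endpoints: rho_k proportional to
    sigma_k = sqrt(p_k * (1 - p_k))."""
    sds = [math.sqrt(p * (1 - p)) for p in rates]
    total = sum(sds)
    return [sd / total for sd in sds]

rho = neyman_allocation([0.20, 0.35])
```

Note that the rate closer to 0.5 has the larger Bernoulli variance, so that arm receives the larger share.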
2.4 Survival Endpoints
For time-to-event endpoints, the allocation update uses interim event rates rather than response rates. Patients contribute to allocation updates only after a minimum follow-up period. Dropout is applied during the accrual phase (not just at final analysis) so that dropped patients do not influence allocation probabilities.
The event count requirement uses the Schoenfeld formula with Bonferroni adjustment for multi-arm designs:
d = 4 (z_{1−α/(K−1)} + z_{1−β})² / (log HR)²
where K is the number of arms. For two-arm designs, this reduces to the standard Schoenfeld formula.
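The formula above can be sketched with the standard library (helper name illustrative; assumes 1:1 allocation within each pairwise comparison):

```python
import math
from statistics import NormalDist

def events_required(hr, alpha=0.025, power=0.80, n_arms=2):
    """Schoenfeld event count per pairwise comparison vs control,
    with the one-sided alpha Bonferroni-split across K-1 comparisons."""
    z = NormalDist().inv_cdf
    alpha_adj = alpha / (n_arms - 1)
    d = 4 * (z(1 - alpha_adj) + z(power)) ** 2 / math.log(hr) ** 2
    return math.ceil(d)
```

For a two-arm design with HR = 0.7, one-sided α = 0.025, and 80% power this gives the familiar benchmark of 247 events; adding arms shrinks the per-comparison α and raises the requirement.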
3. Covariate-Adaptive Minimization
Pocock-Simon Method
The Pocock-Simon minimization algorithm (1975) assigns each new patient to the arm that minimizes a weighted imbalance function across prognostic factors. The assignment is probabilistic:
Compute imbalance
For each arm, hypothetically assign the patient and compute the total weighted imbalance across all factor levels. Two measures are supported: range (max − min of arm counts per level) and variance (sample variance of arm counts per level).
Identify minimizing arm
The arm that produces the smallest total weighted imbalance is identified. If multiple arms are tied, one is chosen at random among the tied arms.
Biased coin assignment
Assign to the minimizing arm with probability p (typically 0.70–0.85) and to a random other arm with probability 1 − p. Setting p = 1 is deterministic minimization; p = 1/K is pure random allocation.
Factor weights: Each prognostic factor can be assigned a weight (default 1.0) reflecting its clinical importance. The total imbalance is the weighted sum across factors. Factors with higher weights receive stronger balance enforcement.
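The three steps above (hypothetical imbalance, tie-break, biased coin) can be sketched as follows, using the range measure. The function name and data layout are illustrative, not the engine's API:

```python
import random

def pocock_simon_assign(patient_levels, counts, weights, p_min=0.75, rng=random):
    """One Pocock-Simon assignment (range imbalance measure).

    patient_levels : the new patient's level index for each factor
    counts         : counts[f][level][arm] = current arm counts at that level
    weights        : weight per factor
    p_min          : probability of choosing the imbalance-minimizing arm
    """
    n_arms = len(counts[0][0])
    totals = []
    for arm in range(n_arms):
        total = 0.0
        for f, level in enumerate(patient_levels):
            hypothetical = list(counts[f][level])
            hypothetical[arm] += 1                       # pretend-assign
            total += weights[f] * (max(hypothetical) - min(hypothetical))
        totals.append(total)
    best = min(totals)
    tied = [a for a in range(n_arms) if totals[a] == best]
    minimizer = rng.choice(tied)                         # random among ties
    if rng.random() < p_min:
        chosen = minimizer                               # biased coin: favor balance
    else:
        chosen = rng.choice([a for a in range(n_arms) if a != minimizer])
    for f, level in enumerate(patient_levels):
        counts[f][level][chosen] += 1                    # commit assignment
    return chosen
```

With `p_min=1.0` the assignment is deterministic, which makes the behavior easy to verify: an arm that is behind at the patient's factor levels always receives the patient.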
4. Simulation Algorithm
RAR Monte Carlo Procedure
Each simulation trial proceeds through three phases: burn-in, adaptive allocation, and final hypothesis test. The full procedure is repeated under both H₀ (no treatment effect) and H₁ (true effect as specified).
Burn-in (equal allocation)
The first n_burn = ⌈f · N⌉ patients are allocated by deterministic round-robin across arms. This guarantees every arm receives enough data for initial rate/mean estimation. Default burn-in fraction f = 0.20.
Adaptive phase
For patients i > n_burn, the allocation probability is recalculated every m patients (default m = 1) based on accrued outcome data. The method-specific function (DBCD coin, Thompson posterior draw, or Neyman target) produces allocation probabilities, which are clipped to [δ, 1 − δ] and renormalized. For survival endpoints, only patients with follow-up exceeding min_follow_up months contribute to rate estimation, and dropout is applied during the accrual phase (not just at final analysis).
Outcome generation
Binary: Y_i ~ Bernoulli(p_k). Continuous: Y_i ~ N(μ_k, σ²). Survival: T_i ~ Exponential(λ_k) with administrative and dropout censoring.
Final hypothesis test
Two-arm: One-sided z-test (binary/continuous) or logrank test (survival). Multi-arm: Pairwise comparisons against control with Bonferroni correction at α/(K − 1). Rejection occurs if any pairwise comparison is significant.
Equal-randomization benchmark
Each RAR trial is paired with an equal-probability randomized trial using the same parameters. The benchmark uses true random allocation (not deterministic cycling), so its operating characteristics reflect realistic randomization variance.
Reproducibility: When a seed is provided, each simulation uses a deterministic RNG chain. The engine stores an input_hash (SHA-256 of all parameters) to verify that repeated runs produce identical results.
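The three phases can be sketched end-to-end for a two-arm binary trial. This is a deliberately simplified illustration — a plug-in Rosenberger target rather than the full DBCD coin, and hypothetical names throughout — not the engine's implementation:

```python
import math
import random

def simulate_trial(rates, n_total, burn_in_frac=0.20, delta=0.10, rng=None):
    """One two-arm binary RAR trial: round-robin burn-in, adaptive phase
    targeting a plug-in Rosenberger allocation, then a one-sided z-test.
    Returns (reject, n_per_arm)."""
    rng = rng or random.Random(0)
    n, s = [0, 0], [0, 0]
    n_burn = math.ceil(burn_in_frac * n_total)
    for i in range(n_total):
        if i < n_burn:
            arm = i % 2                               # deterministic round-robin
        else:
            # plug-in Rosenberger target from observed rates, clipped to bounds
            p_hat = [max(s[k] / n[k], 1e-6) for k in (0, 1)]
            rho1 = math.sqrt(p_hat[1]) / (math.sqrt(p_hat[0]) + math.sqrt(p_hat[1]))
            rho1 = min(max(rho1, delta), 1 - delta)   # clip to [delta, 1-delta]
            arm = 1 if rng.random() < rho1 else 0
        n[arm] += 1
        s[arm] += rng.random() < rates[arm]           # Bernoulli outcome
    # final one-sided pooled two-proportion z-test
    p_pool = (s[0] + s[1]) / n_total
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n[0] + 1 / n[1]))
    z = (s[1] / n[1] - s[0] / n[0]) / se if se > 0 else 0.0
    return z > 1.959964, n                            # reject at one-sided 0.025
```

Repeating this under H₁ (different rates) and H₀ (equal rates) yields the power and Type I error estimates described above.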
Minimization Simulation Procedure
The minimization simulator generates patient factor profiles from the specified prevalence distributions, applies the Pocock-Simon algorithm, and compares balance against a pure-random control.
Generate patient profile
For each patient, sample a level for each factor from the multinomial distribution defined by that factor's prevalences.
Compute hypothetical imbalances
For each candidate arm, hypothetically assign the patient and compute the weighted sum of per-factor imbalances. Range measure: I_f = max_k n_k − min_k n_k. Variance measure: I_f = Var_k(n_k), where n_k is the count in arm k at the patient's level of factor f. Total: D = Σ_f w_f I_f.
Assign with biased coin
Assign to the arm minimizing D with probability p, or to a uniform random other arm with probability 1 − p.
5. Operating Characteristics
RAR Simulation Outputs
When simulation is enabled, the calculator runs Monte Carlo trials under both H₀ and H₁ to estimate:
| Metric | Description |
|---|---|
| power | Proportion of H₁ simulations rejecting the null |
| type1_error | Proportion of H₀ simulations falsely rejecting |
| ens | Expected number of successes across all arms |
| enf | Expected number of failures across all arms |
| wrong_direction_probability | Probability that the best arm does not receive the most patients |
| comparison_equal | Same metrics under true equal-probability randomization |
Minimization Simulation Outputs
Minimization simulation compares factor-level balance against pure random allocation. Power and Type I error are not applicable since minimization does not modify the hypothesis test.
| Metric | Description |
|---|---|
| factor_balance | Per-factor mean imbalance: minimization vs pure random |
| overall_weighted_imbalance | Weighted mean imbalance with reduction percentage |
| arm_counts | Per-arm sample size distribution (mean ± SD) |
6. Statistical Assumptions
RAR — Outcome-Adaptive
- Stationarity: Response rates (or hazard rates) are constant over time within each arm. Time trends violate this assumption and can inflate Type I error beyond the nominal level.
- Independence: Patient outcomes are independent conditional on arm assignment. Clustered or correlated outcomes (e.g., site effects) are not modeled.
- Known effect structure: The true rates or means per arm are fixed but unknown. The simulator assumes the user-specified values as the true parameters under H₁.
- Binary: independent Bernoulli. Each outcome is an independent draw with success probability p_k.
- Continuous: common variance. All arms share a common standard deviation σ. Heteroscedasticity (different variances per arm) is not modeled.
- Survival: exponential model. Event times follow an exponential distribution with arm-specific hazard λ_k. Non-proportional hazards and cure-rate models are not supported.
Minimization
- Known factor distributions: Prevalences for each factor level are specified upfront and assumed accurate. Actual enrollment distributions may differ.
- Factor independence: Patient factor profiles are generated independently. Correlation between factors (e.g., age and comorbidity) is not modeled.
- No outcome modeling: Minimization only addresses balance, not treatment effect. Power and Type I error depend on the analysis method, not the randomization scheme.
7. Limitations & When Not to Use
When RAR May Not Be Appropriate
Time trends present: If disease severity, standard of care, or patient population changes during enrollment, RAR can dramatically inflate Type I error. Korn & Freidlin (2011) showed inflation from 5% to >25% under realistic drift scenarios. Use stratified randomization or fixed allocation when temporal confounders are plausible.
Confirmatory phase III trials: Proschan & Evans (2020) identify five problems with RAR in confirmatory settings: (1) bias in treatment effect estimates, (2) inflated Type I error, (3) reduced power per patient, (4) predictability enabling selection bias, (5) logistical complexity. DBCD mitigates issues 3–4 but not 1–2.
Delayed outcomes: When response takes months to observe (e.g., 6-month survival endpoint), the adaptive updates lag behind enrollment. With fast accrual and slow response, most patients are enrolled before meaningful adaptation occurs. The calculator warns when the expected event probability at minimum follow-up is below 60%.
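Under the exponential survival model assumed elsewhere in this document, the lag is easy to quantify: the probability that a patient has an event by follow-up time t is 1 − exp(−λt) with λ = ln 2 / median. A quick check with illustrative values (12-month median, 3-month minimum follow-up):

```python
import math

def event_probability(median_months, t_months):
    """P(event observed by time t) under an exponential model:
    1 - exp(-lambda * t), with lambda = ln(2) / median."""
    lam = math.log(2) / median_months
    return 1 - math.exp(-lam * t_months)

p = event_probability(12.0, 3.0)  # roughly 0.16
```

Only about 16% of such patients can inform the allocation at minimum follow-up, well below the 60% warning threshold.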
Small total N: With fewer than ~50 patients per arm, the burn-in phase consumes most of the sample and the adaptive phase has too few updates to meaningfully deviate from equal allocation.
When Minimization May Not Be Needed
Few or no critical prognostic factors: If no strong prognostic factors are known, simple randomization with stratification is simpler and equally effective.
Large trials: With N >500, the law of large numbers ensures good balance even under simple randomization. Minimization provides marginal benefit.
8. Regulatory Considerations
Outcome-Adaptive Randomization
- FDA Guidance on Adaptive Designs (2019) permits RAR but recommends careful evaluation of Type I error and bias in treatment effect estimates.
- Time trends can inflate Type I error from 5% to >25% (Korn & Freidlin 2011). RAR should be used with caution when temporal confounders are plausible.
- DBCD has near-zero wrong-direction probability; Thompson sampling can have 10–14% wrong-direction probability depending on effect size (Thall, Fox & Wathen 2015).
- A burn-in period with equal allocation is required to prevent early extreme imbalance. A minimum of 20% of total enrollment is recommended.
Minimization
- EMA and ICH E9 recognize minimization as an acceptable randomization method. CPMP guidance recommends retaining a random element (p < 1) to preserve unpredictability.
- Deterministic minimization (p = 1) makes treatment predictable and is generally discouraged for confirmatory trials.
- Analysis should adjust for minimization factors (stratified analysis or covariate adjustment) to maintain validity.
9. Monte Carlo Validation
The RAR and minimization simulators are validated against known theoretical properties and published benchmarks. Key validation tests include:
Rosenberger Allocation Accuracy
For two-arm binary endpoints, the theoretical optimal is ρ₁ = √p₁ / (√p₁ + √p₂). The computed value matches this closed form to 4 decimal places.
Type I Error Control Under H₀
Under H₀ (equal rates across arms), the simulator verifies that the rejection rate does not exceed the nominal α plus Monte Carlo noise. With 10,000 simulations, the 95% Monte Carlo interval for the observed Type I error is α ± 1.96 √(α(1 − α)/10,000).
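That interval can be computed directly (a small helper, not part of the engine):

```python
import math

def mc_ci(alpha, n_sims, z=1.959964):
    """95% Monte Carlo interval for the observed rejection rate when the
    true rate equals the nominal alpha: alpha +/- z*sqrt(alpha(1-alpha)/n)."""
    half = z * math.sqrt(alpha * (1 - alpha) / n_sims)
    return alpha - half, alpha + half

lo, hi = mc_ci(0.025, 10_000)  # roughly (0.022, 0.028)
```

An observed Type I error outside this band at 10,000 simulations flags a control failure rather than simulation noise.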
DBCD Convergence
Mean simulated allocation to the better arm converges to the Rosenberger target as N → ∞. The allocation trajectory band (p10–p90) narrows with increasing N.
Minimization Balance Improvement
With 2 binary factors, minimization reduces expected imbalance (range measure) by 60–80% compared to pure random allocation, consistent with the benchmarks in Pocock & Simon (1975).
Reproducibility
Re-running the same request with the same seed produces bit-identical results. The engine verifies this by computing an input hash and comparing output checksums.
Test suite: The RAR calculator is covered by 50 unit tests (allocation functions, simulation, validation rules, survival endpoints) and the minimization calculator by 34 tests. All tests are run on every deployment.
10. API Reference
POST /api/v1/calculators/rar
Outcome-adaptive randomization with analytical allocations and optional Monte Carlo simulation.
Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| method | string | "dbcd" | "dbcd", "thompson", or "neyman" |
| endpoint_type | string | "binary" | "binary", "continuous", or "survival" |
| n_arms | int | 2 | Total arms including control [2, 6] |
| n_total | int | 200 | Total sample size [20, 10000] |
| alpha | float | 0.025 | One-sided significance level (0, 1) |
| arm_rates | float[]? | null | Binary: response rate per arm (length = n_arms) |
| arm_means | float[]? | null | Continuous: mean outcome per arm (length = n_arms) |
| common_sd | float? | 1.0 | Continuous: common SD across arms (>0) |
| hazard_ratio | float? | 0.7 | Survival: treatment/control HR (0, 1) |
| median_control | float? | 12.0 | Survival: median control survival in months (>0) |
| accrual_time | float? | 24.0 | Survival: accrual period in months (>0) |
| follow_up_time | float? | 12.0 | Survival: follow-up after accrual in months (≥0) |
| dropout_rate | float? | 0.0 | Survival: annual dropout rate [0, 1) |
| min_follow_up | float? | 3.0 | Survival: min follow-up before patient informs allocation (≥0) |
| burn_in_fraction | float | 0.20 | Fraction of N with equal allocation [0.05, 0.50] |
| allocation_bounds_delta | float | 0.10 | Min per-arm probability [0.01, 0.25]; δ × n_arms < 1 |
| dbcd_gamma | float | 2.0 | DBCD convergence parameter [0.5, 10] |
| update_frequency | int | 1 | Recalculate allocation every N patients [1, 50] |
| simulate | bool | false | Enable Monte Carlo simulation tier |
| simulation_seed | int? | null | Seed for reproducibility; auto-generated if omitted |
| n_simulations | int | 10000 | Number of simulations [1000, 100000] |
Example Request
{
"method": "dbcd",
"endpoint_type": "binary",
"n_arms": 2,
"n_total": 200,
"arm_rates": [0.20, 0.35],
"burn_in_fraction": 0.20,
"allocation_bounds_delta": 0.10,
"dbcd_gamma": 2.0,
"simulate": true,
"n_simulations": 10000
}
Example Request (Survival Endpoint)
{
"method": "dbcd",
"endpoint_type": "survival",
"n_arms": 3,
"n_total": 450,
"hazard_ratio": 0.7,
"median_control": 12.0,
"accrual_time": 24.0,
"follow_up_time": 12.0,
"dropout_rate": 0.05,
"min_follow_up": 3.0,
"simulate": true,
"n_simulations": 5000
}
Response Fields
| Field | Description |
|---|---|
| rosenberger_optimal_allocation | Optimal allocation proportions per arm |
| neyman_allocation | Neyman variance-minimizing allocation per arm |
| equal_allocation | Equal allocation (1/K per arm) |
| expected_power_equal | Analytical power under equal allocation (2-arm only) |
| events_required_80pct | Survival only: per_comparison, total_approximate, and note |
| expected_event_rates | Survival only: expected event probability per arm at study end |
| design_summary | Human-readable design description |
| regulatory_notes | FDA/EMA guidance citations |
POST /api/v1/calculators/minimization
Pocock-Simon covariate-adaptive minimization with configurable prognostic factors.
Request Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| n_arms | int | 2 | Total arms including control [2, 6] |
| n_total | int | 200 | Total sample size [20, 10000] |
| p_randomization | float | 0.75 | Probability of minimizing assignment [0.50, 1.0] |
| imbalance_function | string | "range" | "range" (max−min) or "variance" |
| factors | object[] | 2 defaults | Each: name, levels[], prevalences[], weight (1–10 factors) |
| simulate | bool | false | Enable Monte Carlo simulation |
| n_simulations | int | 5000 | Number of simulations [500, 50000] |
Example Request
POST /api/v1/calculators/minimization
{
"n_arms": 2,
"n_total": 200,
"p_randomization": 0.75,
"imbalance_function": "range",
"factors": [
{ "name": "Age", "levels": ["<65", ">=65"],
"prevalences": [0.6, 0.4], "weight": 1.0 },
{ "name": "Sex", "levels": ["M", "F"],
"prevalences": [0.5, 0.5], "weight": 1.0 }
],
"simulate": true,
"n_simulations": 5000
}
11. Technical References
- Hu F, Zhang LX (2004). Asymptotic properties of doubly adaptive biased coin designs for multitreatment clinical trials. Annals of Statistics, 32(1), 268–301.
- Rosenberger WF, Stallard N, Ivanova A, Harper CN, Ricks ML (2001). Optimal adaptive designs for binary response trials. Biometrics, 57, 909–913.
- Robertson DS, Choodari-Oskooei B, Dimairo M, Flight L, Pallmann P, Jaki T (2023). Response-adaptive randomization in clinical trials: from myths to practical considerations. Statistical Science, 38(2), 185–208.
- Korn EL, Freidlin B (2011). Outcome-adaptive randomization: is it useful? Journal of Clinical Oncology, 29, 771–776.
- Thall PF, Fox PS, Wathen JK (2015). Statistical controversies in clinical research. Annals of Oncology, 26, 1563–1573.
- Tymofyeyev Y, Rosenberger WF, Hu F (2007). Implementing optimal allocation in sequential binary response experiments. JASA, 102, 224–234.
- Wathen JK, Thall PF (2017). A simulation study of outcome adaptive randomization in multi-arm clinical trials. Clinical Trials, 14(5), 432–440.
- Proschan MA, Evans SR (2020). Resist the temptation of response-adaptive randomization. Clinical Infectious Diseases, 71(11), 3002–3004.
- Pocock SJ, Simon R (1975). Sequential treatment assignment with balancing for prognostic factors. Biometrics, 31, 103–115.
- FDA (2019). Adaptive designs for clinical trials of drugs and biologics: guidance for industry.