Platform Trial Design

Multi-arm, multi-stage (MAMS) trials with staggered arm entry, non-concurrent control strategies, and adaptive stopping rules. Frequentist and Bayesian analysis with binary, continuous, and survival endpoints and Monte Carlo operating characteristics.

1. Overview & Motivation

A platform trial is a multi-arm, multi-stage (MAMS) clinical trial conducted under a single master protocol that evaluates multiple treatments against a shared control arm. Unlike traditional fixed-design trials, a platform trial allows experimental arms to enter and exit the trial at different times based on pre-specified decision rules—arms showing futility are dropped early, arms demonstrating efficacy may graduate, and new arms can be added as they become available. The shared control arm persists throughout the life of the platform.

Perpetual Design

A platform trial has no fixed end date. Arms enter and exit according to interim decision rules, and the infrastructure (screening, randomization, data collection) persists indefinitely. Because each new arm reuses that infrastructure instead of building its own, this “living protocol” approach reduces the startup cost of each new treatment evaluation.

Shared Infrastructure

All arms share a single control arm, screening platform, randomization system, and data monitoring committee. Patients randomized to control contribute comparator data for every active arm concurrently enrolling, yielding substantial efficiency gains over separate parallel trials.

Key distinction: A basket trial tests one treatment across many indications sharing a molecular target. An umbrella trial tests many treatments within one disease, stratified by biomarker. A platform trial adds the temporal dimension—arms enter and exit over time with adaptive stopping rules. This calculator addresses the platform design specifically.

Landmark Examples

•RECOVERY: The Randomised Evaluation of COVID-19 Therapy trial, launched in March 2020, became the definitive platform trial of the pandemic era. It identified dexamethasone as the first life-saving treatment for severe COVID-19 and also evaluated tocilizumab, baricitinib, and over a dozen other arms, all under a single adaptive protocol.
•I-SPY 2: A Bayesian adaptive platform trial in neoadjuvant breast cancer that uses response-adaptive randomization to graduate promising agents to phase III. Multiple therapies have graduated from I-SPY 2, including neratinib and pembrolizumab, demonstrating the platform model for accelerated drug development.
•STAMPEDE: A multi-arm, multi-stage platform trial in advanced prostate cancer running since 2005. STAMPEDE has evaluated over 10 experimental arms against a common control, using intermediate outcome measures for early futility stopping and definitive survival endpoints for efficacy graduation.
•GBM AGILE: A Bayesian adaptive platform trial for glioblastoma using response-adaptive randomization within biomarker subtypes. GBM AGILE integrates seamless phase II/III evaluation with Bayesian interim analyses to rapidly identify effective therapies in a lethal cancer with high unmet need.
•NCI ComboMATCH: A precision medicine platform trial testing combination therapies matched to tumor molecular profiles. ComboMATCH uses a shared screening infrastructure to assign patients to biomarker-defined treatment arms, with arms opening and closing based on emerging evidence—combining platform trial logistics with biomarker-driven assignment.

When to Use a Platform Trial

•Multiple treatments to evaluate: There are several candidate treatments targeting the same disease or condition, and evaluating them sequentially in separate trials would be prohibitively slow and expensive.
•Evolving treatment landscape: New treatments are expected to become available during the trial period and should be evaluable without starting new protocols from scratch.
•Shared patient population: All treatments target the same disease population with the same standard-of-care control, making a common control arm scientifically defensible.
•Adaptive decision-making: There is a desire to stop futile arms early and reallocate resources to more promising treatments, with pre-specified interim analyses at each stage.

Comparison to Basket and Umbrella Trials

Feature	Basket	Umbrella	Platform
Treatments	One	Multiple (biomarker-matched)	Multiple (added/dropped over time)
Indications	Multiple	One	One (typically)
Control arm	Per-indication or none	Shared (concurrent)	Shared (concurrent + non-concurrent)
Temporal structure	Fixed	Fixed	Staggered entry/exit
Interim decisions	Optional	Optional	Required (efficacy + futility)

2. MAMS Boundaries

Frequentist Multi-Stage Boundaries

At each stage $k = 1, \ldots, K$, each active arm $j$ is tested against the shared control using a stage-specific boundary. An arm is stopped for efficacy if its test statistic exceeds the upper boundary, or stopped for futility if it falls below the lower boundary.

Alpha-Spending Functions

When boundary_method = "spending", the per-arm type I error is allocated across stages using an alpha-spending function $\alpha^*(t)$ where $t = k / K$ is the information fraction:

O’Brien-Fleming

\alpha^*_{\text{OBF}}(t) = 2 - 2\Phi\!\left(\frac{z_{\alpha/2}}{\sqrt{t}}\right)

Conservative at early looks, spending very little alpha initially and reserving most for the final analysis. This yields high boundaries at interim stages, making early stopping for efficacy rare unless the effect is very large.

Pocock

\alpha^*_{\text{Pocock}}(t) = \alpha \cdot \ln(1 + (e - 1) \cdot t)

Spends alpha more evenly across stages, producing roughly equal boundaries at each look. This makes early stopping more likely but requires a slightly higher final-stage threshold compared to O’Brien-Fleming.
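The two spending functions above can be evaluated directly to see how differently they allocate alpha across looks. The following is a minimal sketch using only the Python standard library; the function names (obf_spend, pocock_spend, spending_table) are illustrative and not part of the calculator's API.

```python
from math import e, log, sqrt
from statistics import NormalDist

_STD_NORMAL = NormalDist()  # standard normal, mean 0 and sd 1

def obf_spend(t: float, alpha: float) -> float:
    """Cumulative alpha spent at information fraction t (O'Brien-Fleming type)."""
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    return 2 - 2 * _STD_NORMAL.cdf(z / sqrt(t))

def pocock_spend(t: float, alpha: float) -> float:
    """Cumulative alpha spent at information fraction t (Pocock type)."""
    return alpha * log(1 + (e - 1) * t)

def spending_table(n_stages: int, alpha: float):
    """Rows of (stage k, t = k/K, OBF cumulative alpha, Pocock cumulative alpha)."""
    return [
        (k, k / n_stages, obf_spend(k / n_stages, alpha), pocock_spend(k / n_stages, alpha))
        for k in range(1, n_stages + 1)
    ]

for k, t, obf, poc in spending_table(n_stages=3, alpha=0.025):
    print(f"stage {k}: t={t:.3f}  OBF={obf:.6f}  Pocock={poc:.6f}")
```

Both functions spend exactly alpha at t = 1; at the first of three looks the O'Brien-Fleming form has spent far less than the Pocock form, which is why its early boundaries are so much higher.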

Bonferroni Multiplicity

When boundary_method = "bonferroni", the overall significance level is divided across the total number of planned experimental arms $J$, then spent over stages via the chosen alpha-spending function evaluated on each arm's own look schedule. Here $\tau_{j,k}$ denotes the information fraction of arm $j$ at stage $k$:

\alpha^{\text{per-arm}} = \frac{\alpha}{J} \,, \quad \alpha_{j,k} = \alpha^*\!(\tau_{j,k}; \alpha^{\text{per-arm}})

Using total planned arms $J$ rather than the per-stage active count is what protects strong FWER control under staggered arm entry: if multiple arms each spent close to $\alpha$ simply because each happened to be the only arm active at its entry stage, the familywise error rate would inflate above $\alpha$. The simulation block in the calculator (run_h0=true) then verifies the empirical FWER under the staggered schedule. For more aggressive designs that need per-stage allocation, use a prespecified alpha-allocation graph and rely on the simulation-calibrated FWER report.
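A small sketch may help make the per-arm allocation concrete. It assumes that an arm entering at stage $s_j$ has information fraction $\tau_{j,k} = (k - s_j + 1) / (K - s_j + 1)$ at stage $k$; the source does not state how the calculator defines $\tau_{j,k}$, so treat that formula, the helper name per_arm_schedule, and the choice of the Pocock form as assumptions made for illustration.

```python
from math import e, log

def pocock_spend(t: float, alpha: float) -> float:
    """Pocock-type cumulative alpha spent at information fraction t."""
    return alpha * log(1 + (e - 1) * t)

def per_arm_schedule(alpha, n_planned_arms, n_stages, entry_stage, spend=pocock_spend):
    """Cumulative alpha spent by one arm at each stage where it has a look.

    Assumption: tau = (k - s_j + 1) / (K - s_j + 1) for an arm entering at s_j.
    """
    alpha_arm = alpha / n_planned_arms      # Bonferroni over all planned arms J
    n_looks = n_stages - entry_stage + 1    # K - s_j + 1 looks for this arm
    schedule = {}
    for k in range(entry_stage, n_stages + 1):
        tau = (k - entry_stage + 1) / n_looks
        schedule[k] = spend(tau, alpha_arm)
    return schedule

# Three planned arms in a 3-stage platform; the third arm enters at stage 2.
for entry in (1, 1, 2):
    print(entry, per_arm_schedule(0.025, 3, 3, entry))
```

Whatever the entry stage, each arm's cumulative spend at its final look equals $\alpha / J$, so the three arms together never spend more than $\alpha$.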

Per-Arm Efficacy and Futility Stopping

At each stage $k$, for each active arm $j$, the test statistic $z_{j,k}$ is compared against two boundaries:

\text{Efficacy: } z_{j,k} > u_k \quad \Rightarrow \quad \text{Stop arm } j \text{ for efficacy (graduate)}

\text{Futility: } z_{j,k} < l_k \quad \Rightarrow \quad \text{Stop arm } j \text{ for futility (drop)}

Arms that fall between the boundaries continue to the next stage. The efficacy boundary $u_k$ is derived from the alpha-spending function. The futility boundary $l_k$ is derived from the Bayesian futility threshold (posterior probability of benefit below $\gamma_f$, default 0.10).
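One way to picture the link between the futility threshold and a z-scale boundary is a flat-prior normal approximation, under which the posterior probability of benefit is approximately $\Phi(z)$, so the lower boundary becomes $\Phi^{-1}(\gamma_f)$. This mapping is an assumption made for illustration; the calculator may derive $l_k$ differently, and the helper names below are hypothetical.

```python
from statistics import NormalDist

def futility_z_bound(gamma_f: float = 0.10) -> float:
    """z value below which P(theta > 0 | data) < gamma_f, assuming P ~= Phi(z)."""
    return NormalDist().inv_cdf(gamma_f)

def classify(z: float, upper: float, lower: float) -> str:
    """Apply the per-arm rule: efficacy above upper, futility below lower."""
    if z > upper:
        return "efficacy"
    if z < lower:
        return "futility"
    return "continue"

lower = futility_z_bound(0.10)
print(round(lower, 4))
print(classify(0.4, upper=2.5, lower=lower))
```

Under this approximation the default threshold of 0.10 corresponds to a lower boundary of roughly -1.28 at every stage, so an arm is dropped only when it is trending clearly worse than control.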

Bayesian Decision Rules

Under Bayesian analysis, each arm is evaluated at every stage using posterior probabilities:

\text{Efficacy: } P(\theta_j > 0 \mid \text{data}_k) > \gamma_e \quad \Rightarrow \quad \text{Graduate arm } j

\text{Futility: } P(\theta_j > 0 \mid \text{data}_k) < \gamma_f \quad \Rightarrow \quad \text{Drop arm } j

where $\gamma_e$ is the efficacy threshold (default 0.975) and $\gamma_f$ is the futility threshold (default 0.10). The parameter $\theta_j$ represents the treatment effect for arm $j$ (difference in proportions, mean difference, or log-hazard ratio, depending on endpoint type).
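For a binary endpoint, the posterior probability in these rules can be approximated by sampling. The sketch below assumes independent Beta(prior_alpha, prior_beta) priors on the treatment and control response rates and takes $\theta_j$ as the difference in proportions; the sampling approach and function names are illustrative assumptions, not a description of the calculator's internals.

```python
import random

def posterior_prob_benefit(x_t, n_t, x_c, n_c,
                           prior_alpha=1.0, prior_beta=1.0,
                           draws=20000, seed=0):
    """Monte Carlo estimate of P(p_T > p_C | data) under Beta-binomial updating."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        p_t = rng.betavariate(prior_alpha + x_t, prior_beta + n_t - x_t)
        p_c = rng.betavariate(prior_alpha + x_c, prior_beta + n_c - x_c)
        wins += p_t > p_c
    return wins / draws

def bayes_decision(prob, gamma_e=0.975, gamma_f=0.10):
    """Graduate above gamma_e, drop below gamma_f, otherwise continue."""
    if prob > gamma_e:
        return "graduate"
    if prob < gamma_f:
        return "drop"
    return "continue"

prob = posterior_prob_benefit(x_t=40, n_t=100, x_c=15, n_c=100)
print(prob, bayes_decision(prob))
```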

Binding vs. non-binding futility: By default, futility boundaries are non-binding—they are advisory recommendations to drop an arm, but the FWER calculation assumes the trial could continue even if the futility boundary is crossed. This preserves FWER control under the spending function approach even if the DSMB overrides a futility recommendation.

3. Non-Concurrent Control

Because arms enter the platform at different stages, a key design decision is how to handle control data collected before an arm entered the trial. Control patients enrolled during stages when arm $j$ was not yet active are “non-concurrent” with respect to that arm. Three strategies are available:

Concurrent Only

control_type = "concurrent_only"

Each arm is compared only to control patients enrolled during the stages when that arm was active. This is the most conservative approach and is free from temporal bias, but discards potentially useful control data. An arm entering at stage 3 of a 5-stage platform uses only the control patients from stages 3–5.

Pooled with Time Adjustment

control_type = "pooled_adjusted"

For binary and continuous endpoints, all control data is included with non-concurrent stages down-weighted to 50% of their original sample size, providing a conservative borrowing of historical control information. For survival endpoints, non-concurrent control data is excluded entirely because individual event times cannot be meaningfully down-weighted—this makes pooled_adjusted equivalent to concurrent_only for survival.

Pooled Naive

control_type = "pooled_naive"

All control data is pooled without any time adjustment. This maximizes the control sample size and statistical power but is susceptible to temporal bias—if the patient population, standard of care, or diagnostic practices change over time, the non-concurrent control data may not be representative of the concurrent population.

Risks and Tradeoffs

Strategy	Bias Risk	Power	Recommended When
concurrent_only	None	Lowest	Temporal trends are suspected or standard of care is evolving
pooled_adjusted	Low (binary/continuous: 50% down-weight; survival: excluded)	Moderate (binary/continuous) / Same as concurrent (survival)	Stable population; survival reverts to concurrent-only
pooled_naive	High	Highest	Very stable disease and population over the platform lifetime

Simulation recommendation: Always simulate the operating characteristics under each control strategy to quantify the bias-variance tradeoff for your specific setting. The simulation engine applies the chosen control strategy at each interim look, so the OC results reflect the actual decision behavior under temporal variation.

4. Staggered Arm Entry

A distinguishing feature of platform trials is that experimental arms may enter the platform at different stages. Each arm has an entry_stage parameter (1-indexed) specifying when it begins enrolling. An arm entering at stage $s_j$ in a $K$-stage platform has $K - s_j + 1$ stages of data collection.

Enrollment and Randomization

At each stage, patients are randomized equally among all active experimental arms plus the control arm. If $J_k$ arms are active at stage $k$, each arm (including control) receives $n_{\text{per\_stage}}$ patients in that stage. The effective control sample size available to an arm entering at stage $s_j$ depends on the control strategy, where $n_k$ is the number of control patients enrolled in stage $k$:

n_{\text{ctrl}}^{(j)} = \begin{cases} \sum_{k=s_j}^{K} n_k & \text{concurrent\_only (and pooled\_adjusted for survival)} \\ \sum_{k=s_j}^{K} n_k + 0.5 \sum_{k=1}^{s_j-1} n_k & \text{pooled\_adjusted (binary/continuous)} \\ \sum_{k=1}^{K} n_k & \text{pooled\_naive} \end{cases}
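The case formula above translates into a few lines of code. This sketch assumes every stage contributes the same number of control patients, $n_k = n_{\text{per\_stage}}$, and the function name effective_control_n is hypothetical.

```python
def effective_control_n(entry_stage, n_stages, n_per_stage,
                        control_type, endpoint_type="binary"):
    """Effective control sample size for an arm entering at entry_stage."""
    concurrent = (n_stages - entry_stage + 1) * n_per_stage   # stages s_j..K
    non_concurrent = (entry_stage - 1) * n_per_stage          # stages 1..s_j-1
    if control_type == "concurrent_only":
        return float(concurrent)
    if control_type == "pooled_adjusted":
        if endpoint_type == "survival":
            # Non-concurrent survival data are excluded entirely
            return float(concurrent)
        return concurrent + 0.5 * non_concurrent
    if control_type == "pooled_naive":
        return float(concurrent + non_concurrent)
    raise ValueError(f"unknown control_type: {control_type!r}")

# Arm entering at stage 3 of a 5-stage platform with 100 patients per stage
for strategy in ("concurrent_only", "pooled_adjusted", "pooled_naive"):
    print(strategy, effective_control_n(3, 5, 100, strategy))
```

For this arm the three strategies give 300, 400, and 500 effective control patients respectively, which is the power gradient summarized in the risks and tradeoffs table.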

Multiplicity Adjustment

Boundaries are computed once at the start of the trial using the total number of planned experimental arms $J$, including arms scheduled to enter at later stages. Under Bonferroni multiplicity, the per-arm significance level is fixed:

\alpha_j = \frac{\alpha}{J}

This is intentionally conservative: when arms drop for futility, the boundaries do not become less stringent. This avoids outcome-dependent multiplicity adjustment, which would require a pre-specified alpha-recycling procedure. Under alpha-spending (boundary_method = "spending"), each arm maintains its own independent spending trajectory with Bonferroni-adjusted level $\alpha / J$. The spending function (O'Brien-Fleming or Pocock) controls how that per-arm alpha is distributed across stages, while Bonferroni controls multiplicity across arms. The Dunnett-type correlation from the shared control is not exploited, making the approach conservative.

Late-entering arms: Arms entering at later stages have fewer interim looks and accumulate less data than arms present from the start. This typically results in lower power for late-entering arms unless the per-stage sample size is increased to compensate. The simulation engine captures this effect automatically.

Example: 3-Stage Platform with Staggered Entry

Stage	Active Arms	Randomization	Decisions
1	A, B + Control	1:1:1	Interim: A and B tested; futile arms dropped
2	A (if active), B (if active), C (enters) + Control	1:1:1:1 (or fewer if arms dropped)	Interim: all active arms tested; C has first look
3	Remaining active arms + Control	Equal allocation	Final: definitive efficacy/futility decisions

5. Simulation Algorithm

Monte Carlo Procedure

The platform trial simulator estimates operating characteristics by repeating the full multi-arm, multi-stage design—staggered enrollment, stage-wise randomization, per-arm interim analyses, and adaptive stopping rules—across many simulated datasets under user-specified truth scenarios.

Specify the truth

Define the true treatment effect for each arm via the arms array. Each arm specifies its true effect size and its is_active flag (true = real effect under $H_1$; false = null arm for FWER calibration). Set the entry_stage for each arm to model staggered entry.

Enroll by stage

For each stage $k = 1, \ldots, K$, determine which arms are active (have entered and have not been stopped). Randomize $n_{\text{per\_stage}}$ patients per arm (including control). Track the stage at which each control patient was enrolled for non-concurrent control handling.

Generate data

Simulate endpoint data under the true parameters. For binary: $x_j \sim \text{Bin}(n_j, p_j^{\text{true}})$. For continuous: $Y_j \sim N(\mu_j^{\text{true}}, \sigma^2)$. For survival: generate exponential event times with the specified hazard ratio and apply censoring.

Apply stopping rules

At each interim stage, compute the test statistic or posterior probability for each active arm against its comparator control data (concurrent-only or pooled, per the chosen control strategy). Compare against the efficacy and futility boundaries. Arms exceeding the efficacy boundary graduate; arms below the futility boundary are dropped.

Aggregate across simulations

Over all $S$ simulations, compute per-arm power, per-arm type I error, family-wise error rate, expected total sample size, and the distribution of stages at which each arm is stopped. Report the fraction of simulations where each arm graduates, is dropped for futility, or reaches the final analysis.
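The five steps above can be sketched end to end for the simplest configuration: a binary endpoint, a pooled-variance z-test, and concurrent-only control. This is a simplified illustration, not the calculator's engine; the boundary values passed in are hypothetical placeholders rather than spending-function output, and null arms are simulated at the control rate.

```python
import random
from math import sqrt

def z_stat(x_t, n_t, x_c, n_c):
    """Pooled two-proportion z statistic (treatment minus control)."""
    p_pool = (x_t + x_c) / (n_t + n_c)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    return 0.0 if se == 0 else (x_t / n_t - x_c / n_c) / se

def simulate_platform(arms, null_rate, n_stages, n_per_stage,
                      upper, lower, n_sims=300, seed=1):
    """Monte Carlo sketch; upper/lower hold one boundary per stage."""
    rng = random.Random(seed)

    def draw(p):
        # Responders among n_per_stage patients with response probability p
        return sum(rng.random() < p for _ in range(n_per_stage))

    graduated = {a["name"]: 0 for a in arms}
    fwer_hits = 0
    total_n = 0
    for _ in range(n_sims):
        status = {a["name"]: "pending" for a in arms}
        trt = {a["name"]: [0, 0] for a in arms}   # [responders, patients]
        ctrl_x = [0] * (n_stages + 1)             # control responders by stage
        false_graduation = False
        for k in range(1, n_stages + 1):
            active = [a for a in arms
                      if a["entry_stage"] <= k and status[a["name"]] == "pending"]
            if not active:
                continue
            ctrl_x[k] = draw(null_rate)
            total_n += n_per_stage * (len(active) + 1)
            for a in active:
                name, s = a["name"], a["entry_stage"]
                p_true = a["response_rate"] if a["is_active"] else null_rate
                trt[name][0] += draw(p_true)
                trt[name][1] += n_per_stage
                # Concurrent-only control: stages s..k
                x_c = sum(ctrl_x[s:k + 1])
                n_c = (k - s + 1) * n_per_stage
                z = z_stat(trt[name][0], trt[name][1], x_c, n_c)
                if z > upper[k - 1]:
                    status[name] = "efficacy"
                    graduated[name] += 1
                    false_graduation = false_graduation or not a["is_active"]
                elif z < lower[k - 1]:
                    status[name] = "futility"
        fwer_hits += false_graduation
    return {
        "per_arm_graduation": {n: c / n_sims for n, c in graduated.items()},
        "fwer": fwer_hits / n_sims,
        "expected_total_n": total_n / n_sims,
    }

arms = [
    {"name": "A", "entry_stage": 1, "response_rate": 0.40, "is_active": True},
    {"name": "B", "entry_stage": 2, "response_rate": 0.15, "is_active": False},
]
result = simulate_platform(arms, null_rate=0.15, n_stages=3, n_per_stage=100,
                           upper=[3.0, 2.5, 2.0], lower=[-0.5, 0.0, 0.0])
print(result)
```

The graduation rate is power for a truly active arm and type I error for a null arm. A few hundred replicates keep the sketch fast; stable estimates need many more, as reflected in the calculator's minimum of 1000 simulations.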

Reproducibility: When a seed is provided, each simulation uses a deterministic RNG chain. The engine stores an input_hash (SHA-256 of all parameters) to verify that repeated runs produce identical results.
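A parameter hash of this kind can be computed by serializing the request in a canonical form before hashing. The sketch below assumes sorted-key compact JSON as the canonical form; the engine's actual serialization is not documented here, so hashes produced this way should not be expected to match the input_hash it returns.

```python
import hashlib
import json

def input_hash(params: dict) -> str:
    """SHA-256 hex digest of a canonical JSON rendering of the parameters."""
    # sort_keys makes the digest independent of key order in the request
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

a = input_hash({"n_arms": 3, "n_stages": 3, "simulation_seed": 42})
b = input_hash({"simulation_seed": 42, "n_stages": 3, "n_arms": 3})
print(a == b)
```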

6. Operating Characteristics

When simulation is enabled, the calculator computes the following metrics across all Monte Carlo replicates:

Metric	Description
fwer	Family-wise error rate: P(at least one false graduation among null arms)
per_arm_power	Proportion of simulations where a truly active arm graduates for efficacy
per_arm_type1_error	Proportion of simulations where a truly null arm falsely graduates
expected_total_n	Expected total sample size across all arms and stages, accounting for early stopping
arm_stopping_distribution	Per-arm distribution of stopping stages: fraction stopped at each stage for efficacy, futility, or reaching the final analysis
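To show how these metrics relate to raw replicate outcomes, the following sketch aggregates a list of per-replicate results. The input layout (a dict per replicate mapping arm name to an (outcome, stage) pair) and the function name summarize are assumptions chosen for illustration; expected_total_n is omitted because it depends on enrollment counts rather than outcomes.

```python
from collections import Counter

def summarize(replicates, is_active):
    """Aggregate replicate outcomes into FWER, power, type I error, stopping mix.

    replicates: list of {arm: (outcome, stage)} with outcome in
                {"efficacy", "futility", "final"}
    is_active:  {arm: bool}, True when the arm has a real effect
    """
    n = len(replicates)
    graduations = Counter()
    stopping = {arm: Counter() for arm in is_active}
    fwer_hits = 0
    for rep in replicates:
        false_graduation = False
        for arm, (outcome, stage) in rep.items():
            stopping[arm][(outcome, stage)] += 1
            if outcome == "efficacy":
                graduations[arm] += 1
                false_graduation = false_graduation or not is_active[arm]
        fwer_hits += false_graduation
    return {
        "fwer": fwer_hits / n,
        "per_arm_power": {a: graduations[a] / n for a in is_active if is_active[a]},
        "per_arm_type1_error": {a: graduations[a] / n
                                for a in is_active if not is_active[a]},
        "arm_stopping_distribution": {
            a: {key: c / n for key, c in counts.items()}
            for a, counts in stopping.items()
        },
    }

replicates = [
    {"A": ("efficacy", 2), "B": ("futility", 1)},
    {"A": ("efficacy", 3), "B": ("efficacy", 3)},
    {"A": ("final", 3), "B": ("futility", 2)},
    {"A": ("efficacy", 1), "B": ("futility", 1)},
]
summary = summarize(replicates, {"A": True, "B": False})
print(summary["fwer"], summary["per_arm_power"], summary["per_arm_type1_error"])
```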

Interpretation Guidance

•FWER under staggered entry: When arms enter at different stages, the effective number of simultaneous comparisons varies over time. The FWER metric captures the overall probability of at least one false graduation across the entire platform lifetime, accounting for the staggered structure.
•Expected sample size savings: Early futility stopping reduces the expected total sample size compared to a fixed multi-arm design without interim analyses. The expected_total_n metric quantifies this savings under the specified truth scenario.
•Stopping distribution: The arm-level stopping distribution reveals how aggressively the design stops arms. A well-calibrated design should stop most null arms early (at interim stages) while allowing most active arms to reach later stages or graduate.
•Global null calibration: Always simulate the global null scenario (all arms set to is_active: false) to verify that the FWER is controlled at or below the nominal $\alpha$ level.

7. Statistical Assumptions

All Endpoints

•Shared control arm: A single control arm is shared across all active experimental arms. Control patients are randomized concurrently with the experimental arms at each stage.
•Independence across arms: Treatment effects are assumed independent across experimental arms. There is no information borrowing between arms (each arm is evaluated against the shared control independently).
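•Stationarity (pooled control): Under the concurrent_only control strategy, each arm is compared only with control patients randomized during the same stages, so the comparison is protected against drift in the control outcome distribution over time. Under the pooled strategies, the control outcome distribution is assumed constant over time; pooled_adjusted only down-weights non-concurrent data and does not model a time trend.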
•Equal per-stage sample size: Each arm receives $n_{\text{per\_stage}}$ patients per stage. The calculator does not model unequal allocation ratios across arms within a stage.
•No treatment-by-stage interaction: The treatment effect for each arm is assumed constant across stages. If the treatment effect varies over time (e.g., due to changing patient population or evolving standard of care), the analysis may be biased.

Binary Endpoint

•Independent Bernoulli responses: Each patient's outcome is an independent Bernoulli draw with probability $p_{T_j}$ (treatment) or $p_C$ (control).
•Large-sample approximation: The z-test assumes sufficient events in both arms for the normal approximation to hold at each interim look.

Continuous Endpoint

•Known common variance: All arms share the same known standard deviation $\sigma$. In practice, $\sigma$ is estimated from pilot data; misspecification inflates or deflates power estimates.
•Normal distribution: Individual outcomes are assumed normally distributed. For non-normal data, the z-test is robust for moderate sample sizes by the central limit theorem.

Survival Endpoint

•Proportional hazards: The hazard ratio $\text{HR}_j$ is constant over time within each arm-control comparison. Violations (e.g., delayed treatment effect, crossing survival curves) invalidate the log-rank-based analysis.
•Exponential event times: The simulation assumes exponential survival distributions for both treatment and control arms. The analytical Schoenfeld formula holds more generally under proportional hazards, but the simulation uses exponential draws.
•Uniform accrual and random censoring: Patients accrue uniformly within each stage. Administrative censoring occurs at the end of follow-up. Additional random censoring (dropout) follows an exponential distribution at rate $\lambda_{\text{dropout}}$.
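The survival assumptions above can be illustrated by generating data for a single arm. This sketch simplifies accrual to a uniform distribution over one accrual window and converts the annual dropout proportion r to a monthly exponential rate via -ln(1 - r) / 12; both choices, and the function name, are assumptions rather than documented engine behavior.

```python
import random
from math import log

def simulate_survival_arm(n, median_control, hazard_ratio, accrual_time,
                          follow_up_time, dropout_rate_annual=0.0, seed=0):
    """Return a list of (observed_time_months, event_observed) for one arm."""
    rng = random.Random(seed)
    # Exponential hazard per month: control hazard ln(2)/median, scaled by HR
    hazard = log(2) / median_control * hazard_ratio
    dropout_hazard = -log(1 - dropout_rate_annual) / 12
    study_end = accrual_time + follow_up_time
    records = []
    for _ in range(n):
        entry = rng.uniform(0, accrual_time)      # uniform accrual
        event_t = rng.expovariate(hazard)         # time from entry to event
        drop_t = (rng.expovariate(dropout_hazard)
                  if dropout_hazard > 0 else float("inf"))
        admin_t = study_end - entry               # administrative censoring
        observed = min(event_t, drop_t, admin_t)
        records.append((observed, event_t <= min(drop_t, admin_t)))
    return records

control = simulate_survival_arm(2000, 12.0, 1.0, 24.0, 12.0, 0.05, seed=3)
treated = simulate_survival_arm(2000, 12.0, 0.5, 24.0, 12.0, 0.05, seed=4)
print(sum(e for _, e in control), sum(e for _, e in treated))
```

Pass hazard_ratio=1.0 for the control arm; a treated arm with a hazard ratio below 1 accumulates fewer events over the same follow-up, which is what drives the log-rank comparison.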

8. Limitations & When Not to Use

Current Design Limitations

No information borrowing across arms: Each arm is evaluated independently against the shared control. There is no hierarchical model or information sharing between arms, even if arms test similar mechanisms of action. For designs requiring Bayesian borrowing, consider the basket trial calculator.

No dose-finding: The platform trial calculator evaluates each arm at a single fixed dose. It does not support dose-escalation, dose-response modeling, or seamless phase I/II dose-finding within an arm. Each arm is assumed to be at its pre-determined dose when it enters the platform.

Exponential survival only: The survival endpoint simulation assumes exponential event times (constant hazard). Weibull, log-normal, or piecewise-exponential survival distributions are not supported. For designs sensitive to non-proportional hazards or delayed treatment effects, external simulation may be needed.

No biomarker-driven assignment: Unlike umbrella trials, this platform calculator does not model biomarker stratification. All arms draw from the same undifferentiated patient population. For biomarker-driven platforms, combine this design with external biomarker prevalence modeling.

Fixed per-stage sample size: The number of patients per arm per stage is fixed at design time. There is no sample size re-estimation or response-adaptive randomization within stages. For response-adaptive designs, see the RAR calculator.

Maximum 8 experimental arms: The calculator supports up to 8 experimental arms plus the shared control. Larger platforms may require specialized simulation software with greater computational resources.

9. Regulatory Considerations

FDA Master Protocols Guidance (2022)

•FDA's guidance on master protocols explicitly addresses platform trials as a key subtype. The guidance recommends that the master protocol pre-specify the rules for adding and dropping arms, the interim analysis schedule, the alpha-spending strategy, and the non-concurrent control handling strategy.
•The guidance emphasizes that “the statistical analysis plan should address how the family-wise type I error rate will be controlled or characterized” when multiple experimental arms are evaluated against a common control. Sponsors should document whether FWER control is strong (across all arms) or per-arm.
•FDA requires that simulation-based operating characteristics be provided, demonstrating adequate power for each arm and quantifying the FWER under the global null. This calculator generates exactly these metrics.
•For arms that graduate from the platform, FDA may consider each arm's evidence independently for regulatory approval. The platform protocol should specify the evidentiary standard for graduation (e.g., posterior probability threshold or adjusted p-value).

Pre-Specification Requirements

•Arm entry and exit criteria: The protocol must pre-specify which arms will be evaluated, their entry stages, and the decision rules for efficacy graduation and futility dropping. Post-hoc modifications to these rules undermine the pre-specified error control.
•Non-concurrent control strategy: The choice of concurrent-only, pooled adjusted, or pooled naive control must be pre-specified with scientific justification. FDA recommends sensitivity analyses under alternative control strategies.
•Multiplicity handling: The alpha-spending function or Bonferroni division must be fully specified before the trial begins. The protocol should state whether FWER is controlled in the strong sense (across all arms simultaneously) or in a per-comparison sense.
•Adding arms mid-trial: If new arms may be added after the platform starts, the protocol must specify the amendment process, how the multiplicity adjustment will be updated, and whether previously collected control data will be used for the new arm.

Non-Concurrent Control Acceptability

•FDA generally prefers concurrent controls but recognizes that platform trials may benefit from pooling non-concurrent control data. The guidance states that “if non-concurrent controls are used, the statistical analysis plan should address potential biases.”
•Sponsors should provide evidence that the patient population and standard of care were stable across the periods being pooled. Time-trend analyses and sensitivity analyses using concurrent-only controls are recommended.

10. API Reference

POST /api/v1/calculators/platform

Platform trial design with multi-arm multi-stage boundaries, staggered arm entry, non-concurrent control strategies, and optional Monte Carlo simulation for operating characteristics.

Core Parameters

Parameter	Type	Default	Description
n_arms	int	2	Number of experimental arms, excluding control [1, 8]
arms	PlatformTrialArm[]?	null	Per-arm configuration (name, entry_stage, effect size, is_active); auto-generated if null
n_stages	int	3	Number of analysis stages (interim + final) [2, 5]
n_per_stage	int	100	Patients per arm per stage [20, 2000]
endpoint_type	string	"binary"	"binary", "continuous", or "survival"
analysis_type	string	"frequentist"	"frequentist" or "bayesian"
control_type	string	"concurrent_only"	"concurrent_only", "pooled_adjusted", or "pooled_naive"

Per-Arm Parameters (PlatformTrialArm)

Parameter	Type	Default	Description
name	string	"Arm A"	Display label for this arm
entry_stage	int	1	Stage at which this arm enters the platform (1-indexed) [1, n_stages]
response_rate	float?	0.35	Alternative response rate (binary endpoint) [0.01, 0.99]
mean_effect	float?	0.5	Alternative mean (continuous endpoint)
hazard_ratio	float?	0.7	Hazard ratio vs. control (survival endpoint) (0, 2)
is_active	bool	true	True = has real effect under H1; false = null arm (for FWER calibration)

Frequentist Parameters

Parameter	Type	Default	Description
boundary_method	string	"bonferroni"	"bonferroni" or "spending"
spending_function	string	"obrien_fleming"	"obrien_fleming" or "pocock" (when boundary_method = "spending")
alpha	float	0.025	One-sided significance level (0, 1)

Bayesian Parameters

Parameter	Type	Default	Description
efficacy_threshold	float	0.975	Posterior probability threshold for efficacy graduation (0.5, 1.0)
futility_threshold	float	0.10	Posterior probability threshold for futility dropping [0, 0.5)
prior_alpha	float	1.0	Beta prior alpha (binary endpoint) (>0)
prior_beta	float	1.0	Beta prior beta (binary endpoint) (>0)

Binary Endpoint Parameters

Parameter	Type	Default	Description
null_rate	float	0.15	Shared control response rate [0.01, 0.99]

Continuous Endpoint Parameters

Parameter	Type	Default	Description
null_mean	float	0.0	Shared control mean
common_sd	float	1.0	Common standard deviation across all arms (>0)

Survival Endpoint Parameters

Parameter	Type	Default	Description
median_control	float?	12.0	Control median survival in months (>0)
accrual_time	float?	24.0	Accrual period in months (>0)
follow_up_time	float?	12.0	Additional follow-up after last enrollment (months, >0)
dropout_rate	float?	0.0	Annual dropout rate [0, 1)

Simulation Parameters

Parameter	Type	Default	Description
simulate	bool	false	Enable Monte Carlo simulation for operating characteristics
simulation_seed	int?	null	Seed for reproducibility; auto-generated if omitted
n_simulations	int	10000	Number of Monte Carlo simulations [1000, 100000]

Example Request (Binary, Frequentist, Staggered Entry)

{
  "n_arms": 3,
  "arms": [
    {"name": "Drug A", "entry_stage": 1, "response_rate": 0.35, "is_active": true},
    {"name": "Drug B", "entry_stage": 1, "response_rate": 0.40, "is_active": true},
    {"name": "Drug C", "entry_stage": 2, "response_rate": 0.30, "is_active": true}
  ],
  "n_stages": 3,
  "n_per_stage": 100,
  "endpoint_type": "binary",
  "analysis_type": "frequentist",
  "control_type": "concurrent_only",
  "boundary_method": "spending",
  "spending_function": "obrien_fleming",
  "null_rate": 0.15,
  "alpha": 0.025,
  "simulate": true,
  "n_simulations": 10000,
  "simulation_seed": 42
}

Example Request (Survival, Bayesian, Pooled Control)

{
  "n_arms": 2,
  "arms": [
    {"name": "Immunotherapy", "entry_stage": 1, "hazard_ratio": 0.65, "is_active": true},
    {"name": "Targeted Agent", "entry_stage": 2, "hazard_ratio": 0.75, "is_active": true}
  ],
  "n_stages": 4,
  "n_per_stage": 80,
  "endpoint_type": "survival",
  "analysis_type": "bayesian",
  "control_type": "pooled_adjusted",
  "median_control": 12.0,
  "accrual_time": 36.0,
  "follow_up_time": 12.0,
  "dropout_rate": 0.05,
  "efficacy_threshold": 0.975,
  "futility_threshold": 0.10,
  "simulate": true,
  "n_simulations": 10000
}
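Either example request above can be submitted from Python with the standard library. In this sketch the host name is a placeholder, no authentication is assumed because none is described here, and build_platform_request is a hypothetical helper; the network call itself is left commented out.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # placeholder host, replace with the real one

def build_platform_request(payload: dict, base_url: str = BASE_URL):
    """Build (but do not send) a POST request for the platform calculator."""
    return urllib.request.Request(
        url=f"{base_url}/api/v1/calculators/platform",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

payload = {
    "n_arms": 3,
    "arms": [
        {"name": "Drug A", "entry_stage": 1, "response_rate": 0.35, "is_active": True},
        {"name": "Drug B", "entry_stage": 1, "response_rate": 0.40, "is_active": True},
        {"name": "Drug C", "entry_stage": 2, "response_rate": 0.30, "is_active": True},
    ],
    "n_stages": 3,
    "n_per_stage": 100,
    "endpoint_type": "binary",
    "boundary_method": "spending",
    "spending_function": "obrien_fleming",
    "simulate": True,
    "simulation_seed": 42,
}
request = build_platform_request(payload)

# To send the request against a real deployment:
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
#     print(result["simulation_results"]["fwer"])
```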

Response Fields

Field	Description
analytical_results.endpoint_type	Endpoint type used ("binary", "continuous", or "survival")
analytical_results.analysis_type	Analysis type ("frequentist" or "bayesian")
analytical_results.n_arms	Number of experimental arms
analytical_results.n_stages	Number of analysis stages
analytical_results.n_per_stage	Patients per arm per stage
analytical_results.control_type	Non-concurrent control strategy applied
analytical_results.total_n_max	Maximum total sample size if all arms reach the final stage
analytical_results.boundary_table	Stage-by-stage efficacy and futility boundaries for each arm
analytical_results.per_arm	Per-arm summary: name, entry stage, effect size, cumulative N, test statistic, decision
analytical_results.design_summary	Human-readable summary of the design configuration
analytical_results.regulatory_notes	FDA/EMA guidance citations and recommendations
simulation_results	Monte Carlo OC when simulate=true (fwer, per_arm_power, per_arm_type1_error, expected_total_n, arm_stopping_distribution)
metadata	Engine version, input hash, computation time

11. References

Saville BR, Berry SM. Efficiencies of platform clinical trials: A vision of the future. Clinical Trials. 2016;13(3):358-366.
Royston P, Parmar MKB, Qian W. Novel designs for multi-arm clinical trials with survival outcomes with an application in ovarian cancer. Statistics in Medicine. 2003;22(14):2239-2256.
Parmar MKB, et al. Testing many treatments within a single protocol over 10 years at MRC Clinical Trials Unit. Clinical Trials. 2014;11(6):697-703.
Angus DC, et al. Effect of hydrocortisone on mortality and organ support in patients with severe COVID-19 (REMAP-CAP). JAMA. 2020;324(13):1317-1329.
Sydes MR, et al. Flexible trial design in practice — stopping arms for lack-of-benefit and adding research arms mid-trial in STAMPEDE. Trials. 2012;13:168.
U.S. Food and Drug Administration. Master Protocols: Efficient Clinical Trial Design Strategies to Expedite Development of Oncology Drugs and Biologics: Guidance for Industry. March 2022.
Landovitz RJ, et al. Cabotegravir for HIV prevention in cisgender men and transgender women (HPTN 083). New England Journal of Medicine. 2021;385(7):595-608.
Herbst RS, et al. Lung Master Protocol (Lung-MAP) — a biomarker-driven protocol for accelerating development of therapies for squamous cell lung cancer. Clinical Cancer Research. 2015;21(7):1514-1524.

Last updated: May 2026