Basket Trial Design

Multi-indication single-treatment trials with Bayesian information borrowing. Independent analysis, BHM (Berry et al. 2013), and EXNEX (Neuenschwander et al. 2016) methods with Monte Carlo operating characteristics.

1. Overview & Motivation

A basket trial enrolls patients across multiple disease indications (baskets) who share a common molecular alteration or biomarker, and treats them all with the same investigational therapy. The central question is whether the treatment works uniformly across indications or whether efficacy varies by tumor type or disease context.

Tissue-Agnostic Paradigm

FDA approved the first tissue-agnostic indication (pembrolizumab for MSI-H/dMMR tumors) in 2017, followed by larotrectinib (NTRK fusions) in 2018 and entrectinib in 2019. Basket trials were the primary evidentiary vehicle for these approvals.

Why Borrow Information?

Individual baskets are often small (15–30 patients). By sharing information across baskets via hierarchical models, we gain statistical efficiency when the treatment effect is homogeneous, while protecting against false conclusions when it is heterogeneous.

Key distinction: A basket trial tests one treatment across many indications. An umbrella trial tests many treatments within one indication stratified by biomarker. A platform trial adds or drops arms adaptively over time. This calculator addresses the basket design specifically.

When to Use a Basket Trial

•Biomarker-defined population: A common molecular target (e.g., BRAF V600E, NTRK fusion, MSI-H) is present across multiple tumor types or disease indications.
•Small per-indication populations: No single indication has enough patients for a fully powered standalone trial, but the combined population does.
•Biological plausibility for shared mechanism: There is preclinical or early clinical evidence suggesting the treatment may work through the same pathway regardless of tissue of origin.
•Regulatory efficiency: A single protocol can support multiple indication-specific submissions, potentially with tissue-agnostic labeling if responses are consistent.

2. Independent Analysis

Beta-Binomial Conjugate Model

The simplest approach analyzes each basket independently with no information borrowing. Each basket $k = 1, \ldots, K$ has its own response rate $p_k$ with a Beta prior and binomial likelihood, yielding a closed-form Beta posterior:

p_k \mid x_k, n_k \;\sim\; \text{Beta}(\alpha + x_k,\; \beta + n_k - x_k)

where $x_k$ is the number of responders in basket $k$, $n_k$ is the basket sample size, and $\alpha, \beta$ are the shared Beta prior hyperparameters (default: $\alpha = \beta = 1$, i.e., uniform prior).

Decision Rule

A “Go” decision is declared for basket $k$ when the posterior probability of exceeding the null response rate exceeds the decision threshold:

P(p_k > p_{0,k} \mid \text{data}) > \gamma

where $p_{0,k}$ is the null hypothesis response rate for basket $k$ and $\gamma$ is the decision threshold (default 0.95). The posterior exceedance probability is computed as $1 - F_{\text{Beta}}(p_{0,k};\; \alpha + x_k,\; \beta + n_k - x_k)$, where $F_{\text{Beta}}$ is the Beta cumulative distribution function (the regularized incomplete beta function).

No borrowing baseline: The independent method serves as the reference comparator. It controls per-basket Type I error at (approximately) the level implied by $\gamma$, but has low power when individual baskets are small. BHM and EXNEX attempt to improve power by borrowing strength across baskets.
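The independent Go rule can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the calculator's code: the function names `beta_sf` and `independent_go` are hypothetical, and the Beta tail is obtained by Simpson integration of the density (valid when both posterior parameters are at least 1) rather than by a library incomplete-beta routine.

```python
import math


def beta_sf(p0, a, b, steps=2000):
    """P(X > p0) for X ~ Beta(a, b), integrating the density with Simpson's rule.

    Assumes 0 < p0 < 1 and a, b >= 1 so the density is bounded on [p0, 1].
    """
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)

    def pdf(t):
        if t >= 1.0:
            # At the right endpoint the density is nonzero only when b == 1.
            return math.exp(log_norm) if b == 1.0 else 0.0
        return math.exp(log_norm + (a - 1.0) * math.log(t) + (b - 1.0) * math.log1p(-t))

    h = (1.0 - p0) / steps  # steps must be even for Simpson's rule
    total = pdf(p0) + pdf(1.0)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * pdf(p0 + i * h)
    return total * h / 3.0


def independent_go(x, n, p0, alpha=1.0, beta=1.0, gamma=0.95):
    """Return (posterior exceedance probability, Go decision) for one basket."""
    prob = beta_sf(p0, alpha + x, beta + n - x)
    return prob, prob > gamma


# Example: 8 responders out of 24 against a null rate of 0.15.
prob, go = independent_go(8, 24, 0.15)
print(f"P(p > 0.15 | data) = {prob:.4f}, Go = {go}")
```

With the uniform prior the result can be cross-checked against the identity $P(p_k > p_0 \mid x, n) = P(\text{Binomial}(n + 1, p_0) \le x)$, which holds for integer Beta parameters.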

3. BHM — Berry et al. 2013

Logit-Normal Hierarchical Model

The Bayesian Hierarchical Model (BHM) proposed by Berry et al. (2013) assumes that the basket-level response rates are exchangeable — drawn from a common distribution. The model operates on the logit scale:

\theta_k = \text{logit}(p_k), \quad \theta_k \mid \mu, \tau^2 \;\sim\; N(\mu, \tau^2)

where $\mu$ is the overall mean log-odds and $\tau^2$ is the between-basket variance. Small $\tau^2$ implies homogeneous baskets (strong borrowing); large $\tau^2$ implies heterogeneity (weak borrowing, approaching independent analysis).

Empirical Bayes Estimation

The implementation uses an empirical Bayes (EB) approach rather than a fully Bayesian analysis with MCMC. The hyperparameters $(\mu, \tau^2)$ are estimated by maximizing the marginal likelihood:

L(\mu, \tau^2) = \prod_{k=1}^{K} \mathcal{N}(y_k;\; \mu,\; s_k^2 + \tau^2)

where $y_k = \text{logit}(\hat{p}_k)$ is the observed logit-transformed response rate, with $\hat{p}_k = x_k / n_k$, and $s_k^2$ is the estimated sampling variance of $y_k$, computed via the delta method as $s_k^2 = 1/(n_k \hat{p}_k (1-\hat{p}_k))$. The EB estimates $(\hat{\mu}, \hat{\tau}^2)$ are found by numerical optimization.
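A minimal sketch of the empirical Bayes step follows. Two things in it are assumptions and not taken from the calculator: a 0.5 continuity correction when a basket has 0 or $n_k$ responders (the logit is undefined there), and a plain grid search over $\tau^2$ in place of whatever optimizer the engine uses. The function names are hypothetical.

```python
import math


def logit_summaries(x, n):
    """Observed logits y_k and delta-method variances s_k^2.

    Assumption (not from the source): a 0.5 continuity correction is applied
    when x_k is 0 or n_k.
    """
    y, s2 = [], []
    for xk, nk in zip(x, n):
        if xk == 0 or xk == nk:
            p_hat = (xk + 0.5) / (nk + 1.0)
        else:
            p_hat = xk / nk
        y.append(math.log(p_hat / (1.0 - p_hat)))
        s2.append(1.0 / (nk * p_hat * (1.0 - p_hat)))
    return y, s2


def eb_fit(y, s2, tau2_max=10.0, grid=2000):
    """Maximize the normal marginal likelihood over (mu, tau^2).

    For fixed tau^2 the maximizing mu is the precision-weighted mean, so the
    search reduces to one dimension: a grid over tau^2 in [0, tau2_max].
    """
    best = None
    for i in range(grid + 1):
        tau2 = tau2_max * i / grid
        w = [1.0 / (s + tau2) for s in s2]
        mu = sum(wk * yk for wk, yk in zip(w, y)) / sum(w)
        loglik = -0.5 * sum(
            math.log(2.0 * math.pi * (s + tau2)) + (yk - mu) ** 2 / (s + tau2)
            for yk, s in zip(y, s2)
        )
        if best is None or loglik > best[0]:
            best = (loglik, mu, tau2)
    return best[1], best[2]


y, s2 = logit_summaries([2, 12, 2, 12], [24, 24, 24, 24])
mu_hat, tau2_hat = eb_fit(y, s2)
print(f"mu_hat = {mu_hat:.3f}, tau2_hat = {tau2_hat:.3f}")
```

When all baskets show the same observed rate, the likelihood is maximized at $\hat{\tau}^2 = 0$, which corresponds to full pooling.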

Shrinkage

Given the EB estimates, the posterior mean for each basket is a weighted average of the basket-specific estimate and the grand mean, with shrinkage weight:

B_k = \frac{\hat{\tau}^2}{\hat{\tau}^2 + s_k^2}

\hat{\theta}_k^{\text{BHM}} = B_k \cdot y_k + (1 - B_k) \cdot \hat{\mu}

When $B_k \approx 1$ (large $\tau^2$ or small $s_k^2$), the basket retains its own estimate. When $B_k \approx 0$ (small $\tau^2$ or large $s_k^2$), the basket is pulled toward the grand mean. The posterior rate is then $\hat{p}_k^{\text{BHM}} = \text{expit}(\hat{\theta}_k^{\text{BHM}})$.
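The shrinkage formulas translate directly into code. The sketch below takes the logit summaries and EB estimates as given inputs; the function name `shrink` and the numbers in the example are illustrative only.

```python
import math


def shrink(y, s2, mu_hat, tau2_hat):
    """Return (B_k, shrunken logit, shrunken rate) for each basket."""
    out = []
    for yk, sk2 in zip(y, s2):
        b = tau2_hat / (tau2_hat + sk2)           # shrinkage weight B_k
        theta = b * yk + (1.0 - b) * mu_hat       # weighted average on the logit scale
        out.append((b, theta, 1.0 / (1.0 + math.exp(-theta))))  # expit back to a rate
    return out


# One basket with y_k = 0 (observed rate 0.5), s_k^2 = 0.5, and EB estimates
# mu_hat = -1, tau2_hat = 0.5: the weight is 0.5, so the logit moves halfway.
for b, theta, rate in shrink([0.0], [0.5], mu_hat=-1.0, tau2_hat=0.5):
    print(f"B_k = {b:.2f}, theta = {theta:.2f}, rate = {rate:.4f}")
```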

Heterogeneity Assessment

The calculator reports Cochran's Q statistic and the $I^2$ index to quantify between-basket heterogeneity:

Q = \sum_{k=1}^{K} w_k (y_k - \hat{\mu}_{\text{FE}})^2, \quad w_k = 1/s_k^2

I^2 = \max\!\left(0,\; \frac{Q - (K-1)}{Q} \times 100\right)

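Here $\hat{\mu}_{\text{FE}} = \sum_k w_k y_k / \sum_k w_k$ is the fixed-effect (inverse-variance-weighted) mean. $I^2 < 25\%$ suggests low heterogeneity (strong borrowing appropriate); $I^2 > 75\%$ suggests substantial heterogeneity (borrowing may be harmful). The Q test p-value is also reported based on the $\chi^2_{K-1}$ distribution.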
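As an illustration, the heterogeneity statistics can be computed from the logit summaries as below. The function names are hypothetical. The chi-square tail uses the standard recurrence for the regularized upper incomplete gamma function, $Q(a+1, x) = Q(a, x) + x^a e^{-x} / \Gamma(a+1)$, started from $Q(1, x) = e^{-x}$ or $Q(1/2, x) = \text{erfc}(\sqrt{x})$, so no external library is needed.

```python
import math


def chi2_sf(q, df):
    """Upper tail P(chi2_df > q) for integer df >= 1."""
    x = q / 2.0
    if df % 2 == 0:
        a, tail = 1.0, math.exp(-x)               # Q(1, x)
    else:
        a, tail = 0.5, math.erfc(math.sqrt(x))    # Q(1/2, x)
    while a < df / 2.0:
        if x > 0.0:
            tail += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return tail


def heterogeneity(y, s2):
    """Cochran's Q, I^2 (percent), and the Q-test p-value on K - 1 df."""
    w = [1.0 / s for s in s2]
    mu_fe = sum(wk * yk for wk, yk in zip(w, y)) / sum(w)
    q = sum(wk * (yk - mu_fe) ** 2 for wk, yk in zip(w, y))
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q * 100.0) if q > 0.0 else 0.0
    return {"Q": q, "I2": i2, "p_value": chi2_sf(q, df), "mu_fe": mu_fe}


# Four baskets alternating between logits -2 and 0 with equal variances 0.5:
# Q = 8 on 3 df, so I^2 = (8 - 3) / 8 = 62.5%.
print(heterogeneity([-2.0, 0.0, -2.0, 0.0], [0.5, 0.5, 0.5, 0.5]))
```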

Decision Rule under BHM

The Go/No-Go decision still uses the same threshold rule, but the posterior is computed from the shrinkage-adjusted estimate. The posterior variance for basket $k$ on the logit scale is:

\text{Var}(\theta_k \mid \text{data}) = \frac{1}{1/s_k^2 + 1/\hat{\tau}^2}

The posterior exceedance probability is then:

P(p_k > p_{0,k} \mid \text{data}) = 1 - \Phi\!\left(\frac{\text{logit}(p_{0,k}) - \hat{\theta}_k^{\text{BHM}}}{\sqrt{\text{Var}(\theta_k \mid \text{data})}}\right)

where $\Phi$ is the standard normal CDF. This approximation treats the posterior of $\theta_k$ as normal on the logit scale, which holds well for moderate sample sizes and observed rates away from 0 or 1.
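The normal-approximation exceedance probability can be sketched as follows. The posterior variance is written as $B_k s_k^2$, which is algebraically equal to $1/(1/s_k^2 + 1/\hat{\tau}^2)$ and avoids dividing by $\hat{\tau}^2$. How the calculator treats $\hat{\tau}^2 = 0$ is not documented here; the point-mass handling below is an assumption, and the function name is hypothetical.

```python
import math
from statistics import NormalDist


def bhm_exceedance(y_k, s2_k, mu_hat, tau2_hat, p0):
    """P(p_k > p0 | data) under the logit-scale normal approximation."""
    b = tau2_hat / (tau2_hat + s2_k)          # shrinkage weight B_k
    theta = b * y_k + (1.0 - b) * mu_hat      # shrunken logit
    var = b * s2_k                            # equals 1 / (1/s_k^2 + 1/tau^2)
    cut = math.log(p0 / (1.0 - p0))           # logit of the null rate
    if var == 0.0:
        # Assumption: with tau^2 = 0 the posterior is a point mass at mu_hat.
        return 1.0 if theta > cut else 0.0
    return 1.0 - NormalDist(theta, math.sqrt(var)).cdf(cut)


# Basket with y_k = 0, s_k^2 = 0.5, EB estimates (-1, 0.5), null rate 0.15.
print(round(bhm_exceedance(0.0, 0.5, -1.0, 0.5, 0.15), 4))
```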

4. EXNEX — Neuenschwander et al. 2016

Exchangeability / Non-Exchangeability Mixture

EXNEX (Neuenschwander et al. 2016) is a robust extension of the BHM that hedges between exchangeability (information borrowing) and non-exchangeability (independent analysis). Each basket's posterior is a mixture of two components:

f(p_k \mid \text{data}) = w_k \cdot f_{\text{BHM}}(p_k \mid \text{data}) + (1 - w_k) \cdot f_{\text{indep}}(p_k \mid \text{data})

where $w_k \in [0, 1]$ is the exchangeability weight for basket $k$. Setting $w_k = 1$ recovers the full BHM; $w_k = 0$ recovers the independent analysis.

Why EXNEX?

The BHM assumes all baskets are exchangeable. If one basket has a fundamentally different biology (e.g., the biomarker does not drive the tumor in that indication), the BHM still borrows information toward it, inflating Type I error for truly inactive baskets and diluting power for truly active ones. EXNEX mitigates this by allowing each basket to “opt out” of the exchangeable cluster with probability $1 - w_k$.

Implementation

The EXNEX posterior combines the BHM and independent components:

•EX component: The BHM posterior $f_{\text{BHM}}(p_k)$ with shrinkage toward the grand mean, exactly as described in Section 3.
•NEX component: The independent Beta-Binomial posterior $f_{\text{indep}}(p_k) = \text{Beta}(\alpha + x_k,\; \beta + n_k - x_k)$ with no borrowing.
•Mixture: The decision posterior probability is $P_{\text{EXNEX}}(p_k > p_{0,k}) = w_k \cdot P_{\text{BHM}}(p_k > p_{0,k}) + (1 - w_k) \cdot P_{\text{indep}}(p_k > p_{0,k})$ .
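The mixture step is a weighted average of the two exceedance probabilities. The short sketch below also shows how a Go decision can flip as the weight changes; the function names and the input probabilities (0.98 from the EX component, 0.90 from the NEX component) are made up for illustration.

```python
def exnex_probability(p_bhm, p_indep, w):
    """Mixture exceedance probability for one basket with fixed weight w."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("w must lie in [0, 1]")
    return w * p_bhm + (1.0 - w) * p_indep


def weight_sensitivity(p_bhm, p_indep, gamma=0.95, weights=(0.2, 0.3, 0.5, 0.7, 0.9)):
    """Go decision at each candidate weight, for a sensitivity table."""
    return {w: exnex_probability(p_bhm, p_indep, w) > gamma for w in weights}


# The borrowed posterior clears the 0.95 threshold but the independent one
# does not, so the decision depends on how much weight the EX component gets.
print(weight_sensitivity(0.98, 0.90))
```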

Choosing exchangeability weights: The weights $w_k$ are pre-specified (not estimated from data). Default is $w_k = 0.5$ for all baskets. Baskets with strong prior biological plausibility can use higher $w_k$ (e.g., 0.7–0.9); baskets with uncertain mechanism can use lower $w_k$ (e.g., 0.2–0.3). Sensitivity analysis across a range of weights is recommended.

Robustness: EXNEX provides a middle ground between the fully exchangeable BHM (aggressive borrowing, risk of inflated Type I error under heterogeneity) and fully independent analysis (no borrowing, lower power). The default $w_k = 0.5$ is intended to limit FWER inflation across a range of heterogeneity scenarios, but the realized FWER depends on the design and should be confirmed by simulation (Section 5).

5. Simulation Algorithm

Monte Carlo Operating Characteristics

The basket trial simulator estimates operating characteristics by repeating the full analysis procedure across many simulated datasets. Each simulation draws outcomes from the user-specified true response rates and applies the chosen analysis method (independent, BHM, or EXNEX) to produce Go/No-Go decisions.

Specify the truth

Set the true response rate $p_k^{\text{true}}$ for each basket with the alternative_rates parameter. Baskets where $p_k^{\text{true}} > p_{0,k}$ are truly active; baskets where $p_k^{\text{true}} = p_{0,k}$ are truly inactive (null).

Generate data

For each simulation $s = 1, \ldots, S$, draw $x_k^{(s)} \sim \text{Binomial}(n_k, p_k^{\text{true}})$ independently for each basket $k$.

Apply analysis method

Compute the posterior for each basket using the selected method (independent, BHM, or EXNEX). For BHM and EXNEX, the hierarchical model is re-fit on each simulated dataset, re-estimating $(\hat{\mu}^{(s)}, \hat{\tau}^{2(s)})$ via empirical Bayes.

Apply decision rule

For each basket, check whether $P(p_k > p_{0,k} \mid \text{data}^{(s)}) > \gamma$ . Record a Go decision (rejection) if the condition is met.

Aggregate metrics

Over all $S$ simulations, compute per-basket power (for active baskets), per-basket Type I error (for null baskets), family-wise error rate (FWER), false discovery rate (FDR), and expected number of Go decisions.
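The steps above can be sketched for the independent method as follows. This is a simplified illustration, not the engine's implementation: it assumes the default uniform Beta(1, 1) prior, for which $P(p_k > p_{0,k} \mid x, n) = P(\text{Binomial}(n + 1, p_{0,k}) \le x)$, so each basket's Go rule reduces to a responder cutoff computed once up front. The function names are hypothetical.

```python
import math
import random


def go_cutoff(n, p0, gamma):
    """Smallest x with P(p > p0 | x, n) > gamma under a Beta(1, 1) prior.

    Returns n + 1 if no responder count qualifies.
    """
    cdf = 0.0
    for x in range(n + 1):
        cdf += math.comb(n + 1, x) * p0**x * (1.0 - p0) ** (n + 1 - x)
        if cdf > gamma:
            return x
    return n + 1


def simulate_independent(n, null_rates, true_rates, gamma=0.95, n_sims=10000, seed=1):
    """Monte Carlo Go rates per basket and FWER for the independent analysis."""
    rng = random.Random(seed)
    cutoffs = [go_cutoff(nk, p0, gamma) for nk, p0 in zip(n, null_rates)]
    is_null = [pt <= p0 for pt, p0 in zip(true_rates, null_rates)]
    go_counts = [0] * len(n)
    any_false_go = 0
    for _ in range(n_sims):
        false_go = False
        for k, (nk, pt) in enumerate(zip(n, true_rates)):
            x = sum(rng.random() < pt for _ in range(nk))  # Binomial(n_k, p_true) draw
            if x >= cutoffs[k]:
                go_counts[k] += 1
                false_go = false_go or is_null[k]
        any_false_go += false_go
    return {"go_rate": [c / n_sims for c in go_counts], "fwer": any_false_go / n_sims}


result = simulate_independent([24] * 4, [0.15] * 4, [0.40, 0.40, 0.40, 0.15], n_sims=2000)
print(result)
```

For a truly active basket the Go rate is the per-basket power; for a null basket it is the per-basket Type I error. Extending the loop to BHM or EXNEX would require re-fitting the hierarchical model inside each replicate, as described in the "Apply analysis method" step.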

Reproducibility: When a seed is provided, each simulation uses a deterministic RNG chain. The engine stores an input_hash (SHA-256 of all parameters) to verify that repeated runs produce identical results.
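One way such a hash could be produced is shown below. The canonicalization (JSON with sorted keys and compact separators) is an assumption made for this sketch; the engine's actual serialization is not described here, so hashes from this function should not be expected to match the `input_hash` it returns.

```python
import hashlib
import json


def input_hash(params):
    """SHA-256 over a canonical JSON encoding of the request parameters."""
    canonical = json.dumps(params, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Key order does not matter, but any change in a value (here the seed) does.
a = input_hash({"method": "bhm", "simulation_seed": 7})
b = input_hash({"simulation_seed": 7, "method": "bhm"})
print(a == b, a[:12])
```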

6. Operating Characteristics

When simulation is enabled, the calculator computes the following metrics across all Monte Carlo replicates:

Metric	Description
per_basket_power	Proportion of simulations where a truly active basket receives a Go decision
per_basket_type1_error	Proportion of simulations where a truly null basket receives a false Go decision
fwer	Family-wise error rate: P(at least one false Go among null baskets)
fdr	False discovery rate: E[false Go / total Go], averaged over simulations with at least one Go
mean_go_decisions	Expected number of Go decisions per trial
mean_correct_go	Expected number of true-positive Go decisions per trial
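To make the definitions in the table concrete, the sketch below computes each metric from a matrix of simulated Go decisions. The function name and the return layout (dictionaries keyed by basket index) are assumptions for illustration; the calculator's actual response structure may differ. FDR follows the table's definition: the false-Go proportion averaged over replicates with at least one Go.

```python
def operating_characteristics(decisions, is_active):
    """decisions: one list of booleans per simulation (True = Go)."""
    n_sims, k = len(decisions), len(is_active)
    go_rate = [sum(d[j] for d in decisions) / n_sims for j in range(k)]
    false_go = [sum(g and not a for g, a in zip(d, is_active)) for d in decisions]
    total_go = [sum(d) for d in decisions]
    with_go = [(f, t) for f, t in zip(false_go, total_go) if t > 0]
    return {
        "per_basket_power": {j: go_rate[j] for j in range(k) if is_active[j]},
        "per_basket_type1_error": {j: go_rate[j] for j in range(k) if not is_active[j]},
        "fwer": sum(f > 0 for f in false_go) / n_sims,
        "fdr": sum(f / t for f, t in with_go) / len(with_go) if with_go else 0.0,
        "mean_go_decisions": sum(total_go) / n_sims,
        "mean_correct_go": (sum(total_go) - sum(false_go)) / n_sims,
    }


# Four replicates, three baskets; the third basket is truly null.
sims = [
    [True, True, False],
    [True, False, True],
    [False, False, False],
    [True, True, True],
]
print(operating_characteristics(sims, is_active=[True, True, False]))
```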

Interpretation Guidance

•FWER control: The independent method controls per-basket error at the level implied by $\gamma$, but does not formally control FWER. BHM can inflate FWER when some baskets are null and others are active, because borrowing from active baskets pulls the posterior for null baskets upward.
•Power-FWER tradeoff: Stronger borrowing (BHM with homogeneous baskets) yields higher per-basket power but potentially higher FWER under heterogeneity. EXNEX with moderate $w_k$ provides a compromise.
•Scenario sensitivity: Results depend heavily on which baskets are truly active vs. null. Always simulate multiple scenarios (all active, all null, mixed) to understand the design's behavior across plausible truths.

7. Statistical Assumptions

All Methods

•Binary endpoint: All baskets use a binary (responder/non-responder) endpoint. Continuous, ordinal, and time-to-event endpoints are not supported in this calculator.
•Single-arm per basket: Each basket is a single-arm study compared to a fixed null rate $p_{0,k}$ . There is no concurrent control arm within each basket.
•Fixed sample sizes: The sample size per basket $n_k$ is pre-specified and not adaptive. Interim analyses with early stopping are not modeled.
•Shared prior: All baskets share the same Beta prior $\text{Beta}(\alpha, \beta)$ . The default $\alpha = \beta = 1$ is a uniform (non-informative) prior.

BHM-Specific

•Exchangeability: The BHM assumes all basket-level parameters are exchangeable (drawn from a common distribution). This is the core borrowing assumption. Violations (e.g., one basket with fundamentally different biology) can inflate error rates.
•Empirical Bayes vs. full Bayes: The hyperparameters $(\mu, \tau^2)$ are point-estimated via marginal likelihood maximization, not integrated over with a prior. This underestimates posterior uncertainty (narrower credible intervals). For small $K$ (2–3 baskets), the EB estimate of $\tau^2$ can be unstable.
•Logit-normal approximation: The posterior on the logit scale is assumed normal. This is a good approximation when $n_k \hat{p}_k$ and $n_k(1-\hat{p}_k)$ are both at least 5. Baskets with very low or very high observed rates may have poorly calibrated posteriors.

EXNEX-Specific

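•Pre-specified weights: The exchangeability weights $w_k$ are fixed prior to analysis, not learned from data. The choice of weights reflects prior knowledge about whether each basket belongs to the exchangeable cluster. In the full EXNEX model of Neuenschwander et al. (2016) the data update the mixture weights a posteriori; this implementation applies the prior weights directly to the two component posteriors.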
•Two-component mixture: EXNEX uses exactly two components (EX and NEX). Extensions to multiple exchangeability clusters (e.g., grouping baskets by histology) are not implemented.

8. Limitations & When Not to Use

When a Basket Design May Not Be Appropriate

Continuous or time-to-event endpoints: This calculator supports only binary (response/no-response) endpoints. For continuous outcomes or survival endpoints, the BHM/EXNEX framework requires different likelihood specifications not implemented here.

Very small baskets (n < 10): With fewer than 10 patients per basket, the logit-normal approximation used in BHM/EXNEX becomes unreliable. The delta-method variance $s_k^2 = 1/(n_k \hat{p}_k(1-\hat{p}_k))$ diverges when observed rates approach 0 or 1, which is common with very small samples. The calculator enforces a hard minimum of 5 patients per basket, so baskets of 5–9 patients are accepted but should be interpreted with caution.

Highly heterogeneous populations: If the biological mechanism is known to differ substantially across indications, borrowing information can be harmful. In these cases, the BHM inflates Type I error for null baskets and dilutes power for active baskets. Use independent analysis or EXNEX with low $w_k$.

Confirmatory (phase III) trials: Basket trials are primarily used in phase II for signal detection. For confirmatory evidence, regulators typically require indication-specific randomized controlled trials, though tissue-agnostic accelerated approvals have occurred based on compelling basket trial data (e.g., NTRK fusions).

Two baskets only: With $K = 2$ , the BHM estimates of $\tau^2$ are poorly identified. The empirical Bayes estimate may collapse to zero (full pooling) or infinity (no pooling) without intermediate values. Three or more baskets are recommended for stable hierarchical estimation.

Randomized basket trials: This calculator models single-arm baskets compared to fixed historical null rates. If the design includes concurrent control arms within each basket, a different analytical framework is needed.

9. Regulatory Considerations

FDA Guidance

•FDA's Master Protocols guidance (2022) explicitly addresses basket trials and recommends pre-specifying the borrowing method, decision criteria, and simulation-based operating characteristics in the protocol and SAP.
•The guidance states that information borrowing “should be based on a clearly stated assumption of exchangeability” and recommends sensitivity analyses under both borrowing and no-borrowing scenarios.
•For tissue-agnostic claims, FDA expects a “consistent and clinically meaningful response” across indications. The heterogeneity assessment described in Section 3 (Cochran's Q statistic, $I^2$) provides formal tools for evaluating consistency.
•FDA has granted accelerated approvals based on basket trial data for MSI-H/dMMR (pembrolizumab), NTRK fusions (larotrectinib, entrectinib), and BRAF V600E (dabrafenib + trametinib), establishing precedent for the approach.

EMA Considerations

•EMA's reflection paper on use of extrapolation (2018) discusses information borrowing but with more caution than FDA, emphasizing the need for biological plausibility and pre-specification.
•European regulatory framework generally requires indication-specific evidence, though conditional marketing authorizations have been granted based on basket trial data in oncology.

Information Borrowing

•Both FDA and EMA expect simulation-based operating characteristics showing FWER control (or at least quantification) under relevant scenarios, including heterogeneous null/alternative configurations.
•Pre-specification of the borrowing method is critical. Post-hoc selection of the method that gives the best result is considered data dredging and will not be accepted.
•Regulators recommend running the analysis under both borrowing and independent approaches. If conclusions differ materially, the independent analysis is typically given more weight.

10. API Reference

POST /api/v1/calculators/basket

Basket trial analysis with independent, BHM, or EXNEX methods and optional Monte Carlo simulation for operating characteristics.

Request Parameters

Parameter	Type	Default	Description
method	string	"independent"	"independent", "bhm", or "exnex"
n_baskets	int	4	Number of baskets (indications) [2, 10]
basket_names	string[]?	null	Optional labels for each basket (length = n_baskets)
n_per_basket	int[]	[24] × K	Sample size per basket [5, 500] each
null_rates	float[]	[0.15] × K	Null hypothesis response rate per basket (0, 1)
alternative_rates	float[]	[0.40] × (K-1), [0.15]	True response rate per basket for simulation (0, 1)
decision_threshold	float	0.95	Posterior probability threshold for Go decision (0.5, 1.0)
prior_alpha	float	1.0	Beta prior alpha (shared across baskets) (>0)
prior_beta	float	1.0	Beta prior beta (shared across baskets) (>0)
w_ex	float[]?	[0.5] × K	EXNEX only: exchangeability weight per basket [0, 1]
simulate	bool	false	Enable Monte Carlo simulation for OC
simulation_seed	int?	null	Seed for reproducibility; auto-generated if omitted
n_simulations	int	10000	Number of Monte Carlo simulations [1000, 100000]
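A client-side check of the documented ranges might look like the sketch below. It reads square brackets in the table as inclusive bounds and parentheses as exclusive ones, which is an interpretation; the server's actual validation and error messages are not specified here, and the function name is hypothetical.

```python
def validate_request(req):
    """Return a list of error strings; an empty list means the payload passes."""
    errors = []
    if req.get("method", "independent") not in {"independent", "bhm", "exnex"}:
        errors.append("method must be 'independent', 'bhm', or 'exnex'")
    k = req.get("n_baskets", 4)
    if not (isinstance(k, int) and 2 <= k <= 10):
        errors.append("n_baskets must be an integer in [2, 10]")
        return errors  # per-basket checks need a valid K

    def check_array(name, ok, rule):
        values = req.get(name)
        if values is None:
            return  # omitted: the documented default applies
        if len(values) != k:
            errors.append(f"{name} must have length n_baskets ({k})")
        elif not all(ok(v) for v in values):
            errors.append(f"{name}: each value must be {rule}")

    check_array("basket_names", lambda v: isinstance(v, str), "a string")
    check_array("n_per_basket", lambda v: 5 <= v <= 500, "in [5, 500]")
    check_array("null_rates", lambda v: 0 < v < 1, "in (0, 1)")
    check_array("alternative_rates", lambda v: 0 < v < 1, "in (0, 1)")
    check_array("w_ex", lambda v: 0 <= v <= 1, "in [0, 1]")
    if not 0.5 < req.get("decision_threshold", 0.95) < 1.0:
        errors.append("decision_threshold must be in (0.5, 1.0)")
    for name in ("prior_alpha", "prior_beta"):
        if not req.get(name, 1.0) > 0:
            errors.append(f"{name} must be > 0")
    if not 1000 <= req.get("n_simulations", 10000) <= 100000:
        errors.append("n_simulations must be in [1000, 100000]")
    return errors


print(validate_request({"method": "exnex", "n_baskets": 3, "w_ex": [0.7, 0.7, 1.2]}))
```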

Example Request (Independent)

{
  "method": "independent",
  "n_baskets": 4,
  "n_per_basket": [24, 24, 24, 24],
  "null_rates": [0.15, 0.15, 0.15, 0.15],
  "alternative_rates": [0.40, 0.40, 0.40, 0.15],
  "decision_threshold": 0.95,
  "prior_alpha": 1.0,
  "prior_beta": 1.0,
  "simulate": true,
  "n_simulations": 10000
}

Example Request (BHM)

{
  "method": "bhm",
  "n_baskets": 5,
  "basket_names": ["NSCLC", "CRC", "Melanoma", "Thyroid", "Cholangiocarcinoma"],
  "n_per_basket": [30, 30, 30, 20, 20],
  "null_rates": [0.10, 0.10, 0.10, 0.10, 0.10],
  "alternative_rates": [0.35, 0.35, 0.35, 0.10, 0.10],
  "decision_threshold": 0.95,
  "simulate": true,
  "n_simulations": 10000
}

Example Request (EXNEX)

{
  "method": "exnex",
  "n_baskets": 4,
  "n_per_basket": [25, 25, 25, 25],
  "null_rates": [0.15, 0.15, 0.15, 0.15],
  "alternative_rates": [0.40, 0.40, 0.15, 0.15],
  "decision_threshold": 0.95,
  "w_ex": [0.7, 0.7, 0.3, 0.3],
  "simulate": true,
  "n_simulations": 10000
}
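Any of the example payloads above can be sent with a standard HTTP client. The sketch below only builds the request object using Python's standard library. The host `https://example.invalid` is a placeholder, not the real service address, and authentication requirements (if any) are not covered in this document.

```python
import json
import urllib.request

BASE_URL = "https://example.invalid"  # hypothetical host; replace with the real one


def build_basket_request(payload, base_url=BASE_URL):
    """Build a POST request for the basket calculator endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/calculators/basket",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


payload = {
    "method": "exnex",
    "n_baskets": 4,
    "n_per_basket": [25, 25, 25, 25],
    "null_rates": [0.15, 0.15, 0.15, 0.15],
    "alternative_rates": [0.40, 0.40, 0.15, 0.15],
    "decision_threshold": 0.95,
    "w_ex": [0.7, 0.7, 0.3, 0.3],
    "simulate": True,
    "n_simulations": 10000,
}
req = build_basket_request(payload)
print(req.get_method(), req.full_url)
```

Against a real deployment, the request would be sent with `urllib.request.urlopen(req)` and the JSON body of the response parsed with `json.load`.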

Response Fields

Field	Description
analytical_results.method	Analysis method used ("independent", "bhm", or "exnex")
analytical_results.n_baskets	Number of baskets in the design
analytical_results.basket_names	Labels for each basket
analytical_results.per_basket	Per-basket posterior summaries: mean, credible interval, exceedance probability, Go/No-Go
analytical_results.n_go_decisions	Number of baskets receiving a Go decision
analytical_results.heterogeneity	BHM/EXNEX only: Q statistic, I², tau², p-value
analytical_results.shrinkage_weights	BHM/EXNEX only: shrinkage factor B_k per basket
analytical_results.design_summary	Human-readable summary of the design configuration
analytical_results.regulatory_notes	FDA/EMA guidance citations and recommendations
simulation_results	Monte Carlo OC when simulate=true (power, FWER, FDR, etc.)
metadata	Engine version, input hash, computation time

11. References

Berry SM, Broglio KR, Groshen S, Berry DA. Bayesian hierarchical modeling of patient subpopulations: Efficient designs of Phase II oncology clinical trials. Clinical Trials. 2013;10(5):720-734.
Neuenschwander B, Wandel S, Roychoudhury S, Bailey S. Robust exchangeability designs for early phase clinical trials with multiple strata. Pharmaceutical Statistics. 2016;15(2):123-134.
Hobbs BP, et al. Basket trials: Review of current practice and innovations for future trials. Journal of Clinical Oncology. 2022;40(30):3520-3528.
Woodcock J, LaVange LM. Master protocols to study multiple therapies, multiple diseases, or both. NEJM. 2017;377(1):62-70.
Berry SM, Carlin BP, Lee JJ, Müller P. Bayesian Adaptive Methods for Clinical Trials. CRC Press; 2010.
U.S. Food and Drug Administration. Master Protocols: Efficient Clinical Trial Design Strategies to Expedite Development of Oncology Drugs and Biologics: Guidance for Industry. March 2022.

Last updated: May 2026