Bayesian Clinical Trial Design: From Prior to Approval
A comprehensive end-to-end tutorial using Zetyra's Bayesian Toolkit with FDA-approved case studies. Walk through all six toolkit steps—from eliciting a prior to monitoring an ongoing trial—using real data from REBYOTA, oncology dose-finding, and adaptive platform trials.
| Section | Case Study | Endpoint | Key Method |
|---|---|---|---|
| Prior Elicitation & Borrowing | REBYOTA (PUNCH CD3) | Binary | Power prior, historical borrowing |
| Dose-Finding | Oncology Phase I | Toxicity | BOIN design |
| Survival Endpoints | DLBCL Lymphoma | Time-to-event | Commensurate prior, Weibull model |
| Adaptive Randomization | GBM AGILE | Survival | Response-adaptive allocation |
| Sequential Monitoring | REBYOTA (extended) | Binary | Posterior probability boundaries |
| Interim Monitoring | All | Various | PPoS, predictive probability |
1. The Regulatory Landscape
FDA's Bayesian Framework
The FDA's January 2026 guidance on Bayesian methodology marks a watershed moment for clinical trial design. The FDA now explicitly endorses Bayesian methods for:
Governing adaptation rules for interim analyses
Informing design elements (e.g., dose selection) for subsequent trials
Supporting primary inference in registration trials
Augmenting concurrent controls with external or historical data
Key Regulatory Programs
| Program | Purpose | Contact Point |
|---|---|---|
| Complex Innovative Trial Design (CID) | Adaptive/Bayesian design meetings | CDER/CBER |
| Center for Clinical Trial Innovation (C3TI) | Non-adaptive Bayesian demonstrations | C3TI portal |
| Rare Disease Program | Small-sample Bayesian approaches | Office of Orphan Products |
Zetyra Toolkit Overview
| Step | Calculator | Use Case | Key Outputs |
|---|---|---|---|
| 1 | Prior Elicitation | Historical data → informative prior | Beta parameters, ESS, prior predictive |
| 2 | Bayesian Borrowing | Multi-study synthesis, MAP priors | Discount comparison, conflict diagnostics |
| 3 | Sample Size (Single-Arm) | Power/assurance calculations | N, operating characteristics, power curves |
| 4 | Two-Arm Design | RCT with Bayesian borrowing | Frequentist comparison, efficiency gain |
| 5 | Sequential Monitoring | Interim stopping rules | Stopping boundaries, ASN curves |
| 6 | Predictive Power (PPoS) | Interim monitoring | Go/no-go thresholds, sensitivity |
2. REBYOTA: Binary Endpoints with Historical Borrowing
Background
REBYOTA (fecal microbiota, live-jslm) was approved November 2022 for preventing C. difficile recurrence—the first FDA-approved microbiota-based live biotherapeutic. The pivotal PUNCH CD3 trial used Bayesian hierarchical borrowing from the Phase 2b PUNCH CD2 trial.
Why Bayesian? Widespread availability of FMT under enforcement discretion made placebo-controlled enrollment increasingly difficult. FDA recommended formal Bayesian borrowing to enable a feasible trial.
| Trial | Phase | Design | Key Results |
|---|---|---|---|
| PUNCH CD2 | 2b | RCT (2 arms + placebo) | 56.8% success (1-dose) vs 43.2% (placebo) |
| PUNCH CD3 | 3 | RCT with Bayesian borrowing | 70.6% vs 57.5%, P(superiority) = 99.1% |
Step 1: Prior Elicitation
Goal: Build a prior for treatment success rate using PUNCH CD2 data.
Historical Data (PUNCH CD2, single-dose ITT): Treatment: 25/45 successes (55.6%), Placebo: 19/44 successes (43.2%).
Calculator Inputs
| Parameter | Value | Rationale |
|---|---|---|
| Method | Historical Data | Phase 2b results available |
| Events (k) | 25 | Successes in CD2 treatment arm |
| Total (n) | 45 | Patients in CD2 treatment arm |
| Discount factor (δ) | 0.5 | Skeptical borrowing—Phase 2b to 3 uncertainty |
Why δ = 0.5?
1. Phase 2b often shows inflated effects (smaller N, selected sites)
2. PUNCH CD3 had broader eligibility (≥1 vs ≥2 recurrences)
3. Regulatory conservatism—better to be pleasantly surprised
Calculator Output
Prior Distribution: Beta(13.5, 11.5)
Mean: 54.0% Median: 54.0%
95% CI: [36.6%, 70.7%]
ESS: 25 (roughly half the information in the original 45 patients)
Interpretation: Expects ~54% success with substantial uncertainty, reflecting discounted Phase 2b evidence.
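As a quick check, the discounted prior and its prior predictive can be reproduced in a few lines of R. This is a minimal sketch of the power-prior construction; the calculator's exact parameterization of the base prior may differ slightly, so the summaries can deviate in the last digit.
# Power prior sketch: Beta(1 + delta*k, 1 + delta*(n - k)) built from PUNCH CD2
k <- 25; n <- 45; delta <- 0.5
alpha <- 1 + delta * k                                 # ~13.5
beta  <- 1 + delta * (n - k)
c(mean = alpha / (alpha + beta), qbeta(c(0.025, 0.975), alpha, beta))
# Prior predictive: plausible success counts in a future 45-patient cohort
theta <- rbeta(10000, alpha, beta)
quantile(rbinom(10000, 45, theta), c(0.025, 0.5, 0.975))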
Step 2: Sample Size with Bayesian Borrowing
Clinical context: Null rate (p₀) = 45%, Alternative rate (p₁) = 65%, Decision threshold: P(θ > p₀ | Data) ≥ 0.975.
| Parameter | Value |
|---|---|
| Prior | Beta(13.5, 11.5) |
| Null rate | 0.45 |
| Alternative rate | 0.65 |
| Decision threshold | 0.975 |
| Target power | 0.80 |
Recommended Sample Size: N = 45
Type I Error: 0.032 (≤ 0.05 ✔)
Power: 0.81 (≥ 0.80 ✔)
Decision Rule: Declare success if P(θ > 0.45 | Data) ≥ 0.975. At N=45: Need ≥27 successes (60.0%)
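The cutoff follows directly from the Beta posterior. A minimal sketch, assuming the Beta(13.5, 11.5) prior above (the calculator's simulation-based operating characteristics may differ slightly in the last digit):
# Smallest success count x out of 45 with P(theta > 0.45 | x successes) >= 0.975
prior_a <- 13.5; prior_b <- 11.5; n <- 45; p0 <- 0.45
post_prob <- pbeta(p0, prior_a + 0:n, prior_b + n - 0:n, lower.tail = FALSE)
cutoff <- min(which(post_prob >= 0.975)) - 1          # counts start at x = 0; expect 27
c(cutoff = cutoff,
  type1  = 1 - pbinom(cutoff - 1, n, p0),             # ~0.03 under the null
  power  = 1 - pbinom(cutoff - 1, n, 0.65))           # ~0.81 under the alternative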
Power Curve
| True Success Rate | Power |
|---|---|
| 45% (null) | 3.2% |
| 50% | 12% |
| 55% | 30% |
| 60% | 56% |
| 65% (alternative) | 81% |
| 70% | 95% |
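Because the design reduces to declaring success with at least 27 of 45 responders, each power-curve entry is a binomial tail probability. The sketch below should closely reproduce the simulated values:
# Exact power of the ">= 27 of 45 successes" rule at several true rates
true_rates <- c(0.45, 0.50, 0.55, 0.60, 0.65, 0.70)
round(setNames(1 - pbinom(26, 45, true_rates), true_rates), 3)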
Step 3: Two-Arm Comparison (Borrowing vs. Traditional)
Actual PUNCH CD3 Design: 2:1 randomization (treatment:placebo), Treatment arm N = 180, Placebo arm N = 87, Total N = 267.
| Design | Treatment N | Control N | Total | Reduction |
|---|---|---|---|---|
| Frequentist RCT (no borrowing) | 220 | 110 | 330 | — |
| Bayesian + Borrowing (actual) | 180 | 87 | 267 | 19% |
Key Insight: Bayesian borrowing saved 63 patients—critical for a difficult-to-enroll population.
3. Oncology Dose-Finding with BOIN
REBYOTA used a fixed dose. For oncology Phase I trials seeking the Maximum Tolerated Dose (MTD), the Bayesian Optimal Interval (BOIN) design is a widely used model-assisted approach that has received FDA Fit-for-Purpose designation. BOIN uses pre-calculated decision boundaries—no real-time Bayesian computation—while achieving optimal statistical properties.
BOIN Boundaries (Target DLT = 30%)
| Patients Treated | Escalate if DLTs ≤ | De-escalate if DLTs ≥ |
|---|---|---|
| 3 | 0 | 2 |
| 6 | 1 | 3 |
| 9 | 2 | 4 |
| 12 | 2 | 5 |
| 15 | 3 | 6 |
R Implementation
library(BOIN)
# Design parameters
target_dlt <- 0.30 # Target DLT rate (MTD definition)
ncohort <- 10 # Maximum cohorts
cohortsize <- 3 # Patients per cohort
n_doses <- 5 # Number of dose levels
# Step 1: Get decision boundaries
boundaries <- get.boundary(
target = target_dlt,
ncohort = ncohort,
cohortsize = cohortsize
)
# Step 2: Simulate operating characteristics
true_dlt <- c(0.05, 0.10, 0.25, 0.35, 0.50)
oc <- get.oc(
target = target_dlt,
p.true = true_dlt,
ncohort = ncohort,
cohortsize = cohortsize,
ntrial = 10000
)
# Dose 3 (25% DLT) selected ~55% of time
# <10% patients at doses above MTD
4. Survival Endpoints with Commensurate Priors (DLBCL)
REBYOTA's endpoint was binary (recurrence yes/no at 8 weeks). For oncology trials with Overall Survival (OS) or Progression-Free Survival (PFS) endpoints, we need time-to-event models. Diffuse Large B-Cell Lymphoma (DLBCL) trials use Bayesian commensurate priors with Weibull models to incorporate external control data.
The Commensurate Prior Framework
For the concurrent control parameter μ_c and the external control parameter μ_e, the commensurate prior links the two through μ_c | μ_e, τ ~ N(μ_e, 1/τ).
The commensurability parameter τ adapts the amount of borrowing automatically: a large estimated τ means full borrowing (external ≈ concurrent), while τ near zero means little or no borrowing (data conflict detected).
R Implementation with psborrow2
library(psborrow2)
library(cmdstanr)
analysis <- create_analysis_obj(
data_matrix = combined_data,
# exponential baseline shown for brevity; psborrow2 also offers a Weibull PH outcome
outcome = outcome_surv_exponential(
time_var = "os_time",
cens_var = "os_event",
baseline_prior = prior_normal(0, 1000)
),
borrowing = borrowing_hierarchical_commensurate(
ext_flag_col = "is_external",
tau_prior = prior_half_cauchy(0, 0.5)
),
treatment = treatment_details(
trt_flag_col = "treatment_arm",
trt_prior = prior_normal(0, 2.5)
)
)
result <- mcmc_sample(analysis,
iter_warmup = 2000, iter_sampling = 4000, chains = 4
)
# Key outputs: HR posterior, P(HR < 1), ESS borrowed
5. Adaptive Randomization (GBM AGILE)
GBM AGILE (Glioblastoma Adaptive Global Innovative Learning Environment) is a phase 2/3 Bayesian adaptive platform trial—distinct from REBYOTA's fixed randomization. Multiple experimental arms are tested against a common control with Bayesian response-adaptive randomization within disease subtypes.
GBM AGILE Results: Regorafenib Arm
| Subtype | Patients | Mean HR | P(HR < 1.0) | Decision |
|---|---|---|---|---|
| Recurrent | 85 | 1.07 | 0.43 | No benefit |
| Newly Diagnosed | 91 | 1.12 | 0.24 | No benefit |
| Overall | 176 | 1.10 | 0.24 | Discontinued |
The Bayesian framework provides direct probability statements—none approached the 0.98 efficacy threshold, making discontinuation straightforward.
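GBM AGILE's allocation algorithm lives in its master protocol, but the core idea of Bayesian response-adaptive randomization is easy to sketch: arms with a higher posterior probability of being best receive a larger share of the next allocation block. The toy example below uses binary responses and Beta posteriors purely for illustration; the trial itself models survival, and the data shown are hypothetical.
# Toy Bayesian response-adaptive randomization (illustration only; hypothetical data)
set.seed(1)
responders <- c(control = 12, armA = 18, armB = 9)    # responders observed so far
enrolled   <- c(control = 40, armA = 40, armB = 40)   # patients enrolled so far
draws <- sapply(seq_along(responders), function(i)
  rbeta(10000, 1 + responders[i], 1 + enrolled[i] - responders[i]))
p_best <- tabulate(max.col(draws), nbins = 3) / nrow(draws)  # P(arm has the highest rate)
alloc  <- sqrt(p_best) / sum(sqrt(p_best))                   # tempered allocation weights
out <- round(rbind(p_best, alloc), 2); colnames(out) <- names(responders); out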
6. Sequential Monitoring
Why Sequential Monitoring? PPoS (Step 6) answers “how likely is this trial to succeed?” but doesn't formally control Type I error. Sequential Monitoring (Step 5) provides pre-specified stopping boundaries that can serve as the primary monitoring framework in the SAP.
PPoS vs. Sequential Monitoring
| Aspect | PPoS (Step 6) | Sequential Monitoring (Step 5) |
|---|---|---|
| When to use | Ad-hoc interim decisions | Pre-planned stopping rules |
| Error control | Not formally controlled | Formal Type I error control |
| Output | Go/no-go probability | Z-score boundaries per look |
| Regulatory role | Supplementary | Can be primary monitoring |
| Prior dependency | Strong — drives PPoS calculation | Moderate — affects boundary shape |
Three Bayesian Sequential Approaches
1. Posterior Probability (PP) — Implemented
Stop for efficacy at look k when P(θ > 0 | data at look k) ≥ γ. Analytical z-score boundaries exist for Normal-Normal conjugate models via the Zhou & Ji (2024) formula: declare efficacy when the standardized statistic Z_k = √n_k · x̄_k / σ exceeds
b_k = z_γ · √(1 + σ² / (n_k · σ₀²)) - (μ₀ · σ) / (σ₀² · √n_k)
where μ₀ and σ₀² are the prior mean and variance, σ² is the data variance, n_k is the cumulative sample size at look k, and z_γ = Φ⁻¹(γ).
2. Posterior Predictive Probability (PPP)
Asks “Given current data, will the final analysis succeed?” More conservative early, permissive late. Resembles stochastically curtailed testing.
3. Decision-Theoretic (DT)
Explicit loss functions for Type I/Type II errors. Optimal boundaries via backward induction. Most flexible but requires loss specification.
REBYOTA Sequential Design Example
Extending the REBYOTA design with Bayesian sequential monitoring. We add 3 planned analyses at 50%, 75%, and 100% of information to allow early stopping for either efficacy or futility.
Calculator Inputs
| Parameter | Value | Rationale |
|---|---|---|
| Endpoint type | Continuous (Normal-Normal) | Difference in success rates |
| N per look | [45, 68, 90] | 50%, 75%, 100% of N=90 per arm |
| Prior mean | 0.0 | Non-informative starting point |
| Prior variance | 1.0 | Moderate prior uncertainty |
| Data variance | 1.0 | Standardized scale |
| Efficacy threshold (γ) | 0.975 | Stop if P(θ > 0 | data) ≥ 97.5% |
| Futility threshold | 0.10 | Stop if P(θ > 0 | data) ≤ 10% |
Expected Outputs
The calculator produces z-score boundaries at each look. Key structural properties:
Efficacy boundaries monotonically decrease with accumulating data
Futility boundaries are always below efficacy boundaries at each look
With vague priors, boundaries converge to the frequentist z-critical (1.96)
Informative priors shift boundaries — a positive prior mean lowers the evidence bar
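A minimal sketch of these properties, assuming the Normal-Normal boundary formula reconstructed above and the inputs from the table (an illustration of the structure, not the calculator's exact output):
# Efficacy z-score boundary at each look for the posterior-probability rule
pp_boundary <- function(n_k, gamma = 0.975, mu0 = 0, tau0_sq = 1, sigma_sq = 1) {
  qnorm(gamma) * sqrt(1 + sigma_sq / (n_k * tau0_sq)) -
    mu0 * sqrt(sigma_sq) / (tau0_sq * sqrt(n_k))
}
looks <- c(45, 68, 90)
round(pp_boundary(looks), 3)             # ~1.98, 1.97, 1.97: shrinking toward 1.96
round(pp_boundary(looks, mu0 = 0.2), 3)  # a positive prior mean lowers each boundary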
7. Predictive Power & Interim Monitoring
PPoS Framework
Predictive Probability of Success (PPoS) answers: “Given current data, what's the probability the trial will succeed if continued?”
REBYOTA Interim Example (Hypothetical)
Setup: Interim at 50% enrollment (N=135). Treatment: 50/90 (55.6%), Placebo: 22/45 (48.9%).
PPoS: 87%
Decision: CONTINUE (20% ≤ PPoS < 90%)
Sensitivity by Prior:
Skeptical (ESS=4): 79%
Moderate (ESS=25): 87%
Enthusiastic (ESS=45): 94%
Conclusion: Robust across priors — continue enrollment.
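The sensitivity analysis can be reproduced with the calculate_ppos helper defined in the Implementation section below, swapping in priors with different effective sample sizes. The prior centers used here are illustrative assumptions, so the percentages will not match the calculator exactly:
# Prior sensitivity for the interim PPoS (treatment arm 50/90, planned N = 180)
priors <- list(
  skeptical    = c(ess = 4,  mean = 0.45),    # assumed centre: the null rate
  moderate     = c(ess = 25, mean = 0.54),    # discounted PUNCH CD2 prior
  enthusiastic = c(ess = 45, mean = 0.556)    # assumed centre: undiscounted PUNCH CD2
)
sapply(priors, function(p)
  calculate_ppos(current_successes = 50, current_n = 90, planned_n = 180,
                 prior_alpha = p[["ess"]] * p[["mean"]],
                 prior_beta  = p[["ess"]] * (1 - p[["mean"]]),
                 null_rate   = 0.45))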
Decision Thresholds by Endpoint Type
| Endpoint | Futility (Stop) | Continue | Efficacy (Stop) |
|---|---|---|---|
| Binary (REBYOTA) | PPoS < 20% | 20–90% | PPoS ≥ 90% + P(δ>0) ≥ 99% |
| Survival (Oncology) | PPoS < 10% | 10–95% | PPoS ≥ 95% + HR CrI excludes 1 |
| Dose-Finding (BOIN) | P(current > MTD) > 95% | — | MTD identified with ≥6 patients |
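For the binary-endpoint row, the combined rule can be written as a small helper. This is a sketch of the table's logic, not a validated stopping rule, and the 0.97 posterior probability in the usage line is a hypothetical value:
# Go/no-go logic for the binary-endpoint row above
interim_decision <- function(ppos, post_prob_benefit) {
  if (ppos < 0.20) "stop for futility"
  else if (ppos >= 0.90 && post_prob_benefit >= 0.99) "stop for efficacy"
  else "continue"
}
interim_decision(ppos = 0.87, post_prob_benefit = 0.97)   # "continue", as in the REBYOTA interim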
8. Implementation: R and Python
Complete R Workflow
# ============================================
# BAYESIAN CLINICAL TRIAL ANALYSIS TOOLKIT
# ============================================
library(BOIN) # Dose-finding
library(psborrow2) # Historical borrowing
library(gsDesign) # Sample size
library(rstan) # General Bayesian
# --- PRIOR ELICITATION ---
elicit_prior <- function(k, n, delta = 0.5) {
alpha <- 1 + delta * k
beta <- 1 + delta * (n - k)
list(
alpha = alpha, beta = beta,
ess = alpha + beta - 2,
mean = alpha / (alpha + beta),
ci = qbeta(c(0.025, 0.975), alpha, beta)
)
}
# --- BAYESIAN SAMPLE SIZE ---
bayesian_sample_size <- function(
prior_alpha, prior_beta,
null_rate, alt_rate,
threshold = 0.975,
target_power = 0.80,
n_sim = 10000
) {
for (n in seq(20, 500, by = 5)) {
null_successes <- rbinom(n_sim, n, null_rate)
null_posts <- pbeta(null_rate,
prior_alpha + null_successes,
prior_beta + n - null_successes,
lower.tail = FALSE)
type1 <- mean(null_posts >= threshold)
alt_successes <- rbinom(n_sim, n, alt_rate)
alt_posts <- pbeta(null_rate,
prior_alpha + alt_successes,
prior_beta + n - alt_successes,
lower.tail = FALSE)
power <- mean(alt_posts >= threshold)
if (type1 <= 0.05 && power >= target_power) {
return(list(n = n, type1 = type1, power = power))
}
}
}
# --- PPoS ---
calculate_ppos <- function(
current_successes, current_n, planned_n,
prior_alpha, prior_beta,
null_rate, threshold = 0.975,
n_sim = 10000
) {
post_alpha <- prior_alpha + current_successes
post_beta <- prior_beta + current_n - current_successes
remaining <- planned_n - current_n
future_p <- rbeta(n_sim, post_alpha, post_beta)
future_successes <- rbinom(n_sim, remaining, future_p)
final_successes <- current_successes + future_successes
final_alpha <- prior_alpha + final_successes
final_beta <- prior_beta + planned_n - final_successes
final_posts <- pbeta(null_rate, final_alpha,
final_beta, lower.tail = FALSE)
mean(final_posts >= threshold)
}
# --- USAGE (REBYOTA) ---
prior <- elicit_prior(k = 25, n = 45, delta = 0.5)
cat("Prior: Beta(", prior$alpha, ",", prior$beta, ")\n")
ss <- bayesian_sample_size(
prior$alpha, prior$beta,
null_rate = 0.45, alt_rate = 0.65
)
cat("Required N:", ss$n, "\n")
ppos <- calculate_ppos(
current_successes = 50, current_n = 90,
planned_n = 180, prior$alpha, prior$beta,
null_rate = 0.45
)
cat("PPoS:", round(ppos * 100, 1), "%\n")
Python Implementation
import numpy as np
from scipy import stats
def elicit_prior(k: int, n: int, delta: float = 0.5):
"""Construct power prior from historical data."""
alpha = 1 + delta * k
beta = 1 + delta * (n - k)
return {
'alpha': alpha, 'beta': beta,
'ess': alpha + beta - 2,
'mean': alpha / (alpha + beta),
'ci': stats.beta.ppf([0.025, 0.975], alpha, beta)
}
def bayesian_sample_size(
prior_alpha, prior_beta,
null_rate, alt_rate,
threshold=0.975, target_power=0.80, n_sim=10000
):
for n in range(20, 501, 5):
null_successes = np.random.binomial(n, null_rate, n_sim)
null_posts = 1 - stats.beta.cdf(
null_rate,
prior_alpha + null_successes,
prior_beta + n - null_successes
)
type1 = np.mean(null_posts >= threshold)
alt_successes = np.random.binomial(n, alt_rate, n_sim)
alt_posts = 1 - stats.beta.cdf(
null_rate,
prior_alpha + alt_successes,
prior_beta + n - alt_successes
)
power = np.mean(alt_posts >= threshold)
if type1 <= 0.05 and power >= target_power:
return {'n': n, 'type1': type1, 'power': power}
def calculate_ppos(
current_successes, current_n, planned_n,
prior_alpha, prior_beta,
null_rate, threshold=0.975, n_sim=10000
):
post_alpha = prior_alpha + current_successes
post_beta = prior_beta + current_n - current_successes
remaining = planned_n - current_n
future_p = np.random.beta(post_alpha, post_beta, n_sim)
future_successes = np.random.binomial(remaining, future_p)
final_successes = current_successes + future_successes
final_alpha = prior_alpha + final_successes
final_beta = prior_beta + planned_n - final_successes
final_posts = 1 - stats.beta.cdf(
null_rate, final_alpha, final_beta
)
return np.mean(final_posts >= threshold)
# --- USAGE ---
prior = elicit_prior(k=25, n=45, delta=0.5)
print(f"Prior: Beta({prior['alpha']}, {prior['beta']})")
ss = bayesian_sample_size(
prior['alpha'], prior['beta'], 0.45, 0.65
)
print(f"Required N: {ss['n']}, Power: {ss['power']:.2%}")
ppos = calculate_ppos(50, 90, 180,
prior['alpha'], prior['beta'], 0.45)
print(f"PPoS: {ppos:.1%}")
9. Regulatory Documentation Checklist
Prior Specification (FDA Guidance Section V.D)
| Requirement | Documentation |
|---|---|
| Source of information | Study ID, publication, data cut date |
| Prior parameters | Distribution family, parameters, ESS |
| Discounting rationale | Why chosen discount factor is appropriate |
| Sensitivity analysis | Results under skeptical/moderate/enthusiastic priors |
Operating Characteristics (Section IV.A)
| Metric | Report | Target |
|---|---|---|
| Type I error | Simulated under null | ≤ α (one-sided) |
| Power | Simulated under alternative | ≥ 80% |
| Sample size | N (or events) | — |
| Decision rule | Explicit threshold | P(benefit) ≥ γ |
SAP-Ready Template
Bayesian Primary Analysis: The primary endpoint will be analyzed using a Bayesian model with a [distribution] prior for [parameter], derived from [source] with [discount]% discounting (ESS = [X]).
Decision Rule: The trial will declare success if P([parameter] > [threshold] | Data) ≥ [γ].
Operating Characteristics: Under the null hypothesis ([H₀ specification]), Type I error is [X]%. Under the alternative ([H₁ specification]), power is [Y]%.
Sensitivity Analysis: Primary results will be accompanied by analyses under skeptical (ESS=[a]), moderate (ESS=[b]), and enthusiastic (ESS=[c]) priors.
Interim Analysis: At [information fractions], PPoS will be computed. Futility stopping is recommended if PPoS < 20%. Early efficacy stopping is recommended if PPoS ≥ 90% AND posterior probability ≥ 99%.
10. Quick Reference Cards
Prior Selection Guide
| Scenario | Prior Type | ESS |
|---|---|---|
| No historical data | Weakly informative | 2–4 |
| Single historical study | Power prior (δ=0.3–0.7) | 10–30 |
| Multiple studies | MAP prior | Pool |
| Strong historical match | Commensurate | Adaptive |
| Regulatory skepticism | Skeptical (at null) | <10 |
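For the multiple-studies row, a MAP prior is typically built with RBesT. A minimal sketch, assuming two hypothetical historical control studies (see the RBesT vignettes for the full workflow, including prior-data conflict checks):
library(RBesT)
# Two hypothetical historical control studies (responders r out of n)
hist_dat <- data.frame(study = c("hist1", "hist2"), r = c(18, 19), n = c(40, 44))
map_mcmc <- gMAP(cbind(r, n - r) ~ 1 | study, data = hist_dat,
                 family = binomial,
                 tau.dist = "HalfNormal", tau.prior = 0.5,   # between-study heterogeneity
                 beta.prior = 2)                             # weakly informative intercept prior
map_prior <- automixfit(map_mcmc)       # parametric Beta-mixture approximation of the MAP prior
ess(map_prior)                          # effective sample size of the MAP prior
robustify(map_prior, weight = 0.2, mean = 0.5)   # add a vague component for robustness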
Package Cheat Sheet
| Task | R | Python |
|---|---|---|
| Prior elicitation | RBesT | scipy.stats |
| Dose-finding | BOIN | pyboin |
| Historical borrowing | psborrow2 | pymc |
| Sample size | gsDesign | statsmodels |
| General Bayesian | rstan, brms | pymc, cmdstanpy |
11. References
Regulatory Guidance
- FDA (2026). Draft Guidance: Use of Bayesian Methodology in Clinical Trials of Drug and Biological Products.
- FDA (2019). Adaptive Design Clinical Trials for Drugs and Biologics: Guidance for Industry.
Case Study Publications
- Khanna S, et al. (2022). Efficacy and Safety of RBX2660 in PUNCH CD3. Drugs, 82(15):1527-1538.
- Yuan Y, et al. (2016). Bayesian Optimal Interval Design. Clinical Cancer Research, 22:4291-4301.
- Wen PY, et al. (2020). GBM AGILE: A Global Adaptive Platform Trial. JCO, 40(16_suppl):TPS2078.
- Zhou T, Ji Y. (2024). On Bayesian Sequential Clinical Trial Designs. NEJSDS, 2(1):136-151.
Software
- Yan F, et al. (2020). BOIN: An R Package for Dose-Finding Trials. Journal of Statistical Software, 94(13):1-32.
- Genentech. psborrow2 Package. genentech.github.io/psborrow2/
Ready to design your trial?
Start with Prior Elicitation and work through the six-step workflow.