Technical White Paper

Zetyra: A Validated Suite of Statistical Calculators for Efficient Clinical Trial Design

Comprehensive technical documentation for biostatisticians evaluating clinical trial design software.

Version 1.0 | January 2026 | Lu (Maggie) Qian, MS

Key Findings

  • 51 Automated Tests: All passing against industry gold standards
  • 0.0046 z-score: Maximum deviation from gsDesign R package
  • 15-35% Reduction: Typical sample size savings with CUPED
  • FDA/EMA Aligned: Regulatory guidance citations embedded

1 Executive Summary

FDA Published Major Guidance (January 12, 2026)

FDA released draft guidance extending Bayesian methodology to drugs and biologics.

“Bayesian methodologies help address two of the biggest problems of drug development: high costs and long timelines.”
— FDA Commissioner Marty Makary

This white paper demonstrates exactly this value proposition through validated calculators and quantified case studies.

The Challenge

Phase III clinical trials in oncology and cardiology average $50-100 million and require 4-6 years from first patient to database lock. Conservative statistical designs (those that ignore baseline covariates, use fixed samples without interim monitoring, or rely on frequentist paradigms for Phase II decisions) often inflate sample sizes by 15-35% relative to efficient alternatives.

Three proven methodologies can substantially reduce trial costs and duration, but existing software packages are expensive ($5,000-$15,000 annually), complex to deploy, and lack transparent validation.

The Solution

Zetyra is a web-based platform offering three validated statistical calculators that enable efficient clinical trial design:

  • Comprehensive methodology: Integrates CUPED, group sequential, and Bayesian methods in a single platform
  • Affordable: Monthly subscription vs. $5K-$15K annual licenses
  • Transparent validation: Public validation suite (51 automated tests) vs. proprietary validation approaches
  • Accessible: Web-based interface vs. IT department installation requirements
  • Accurate: Maximum deviation 0.0046 z-score vs. pre-specified acceptance criterion of ±0.05

Table 1

Key Validation Results

| Calculator | Tests | Max Deviation | Reference |
|---|---|---|---|
| Group Sequential Design | 30 passed | 0.0046 z-score | gsDesign R package |
| CUPED | 12 passed | Exact match | Analytical VRF formula |
| Bayesian | 9 passed | Exact match | Conjugate prior solutions |

Business Impact (Representative Examples)

For a Phase II oncology trial that needs 240 patients under a standard design, representative gains are:

  • CUPED: 168 patients (30% reduction, ρ=0.55)
  • GSD: 15-40% expected sample size reduction under H₁
  • Bayesian monitoring: improved go/no-go decisions

2 Introduction

2.1 The Clinical Trial Efficiency Problem

Clinical development represents one of the most capital-intensive endeavors in modern medicine. DiMasi et al. (2016) estimated the capitalized cost to bring a single drug from discovery through FDA approval at $2.6 billion, with Phase II and Phase III trials accounting for approximately 60% of total development costs.

Conservative design practices systematically inflate these already-substantial costs. Three common inefficiencies dominate:

1. Failure to leverage baseline covariates

Standard power calculations ignore correlations between baseline measurements and treatment outcomes (ρ = 0.4-0.7 is typical for many endpoints). For continuous outcomes, failing to adjust for baseline covariates inflates sample sizes by a factor of 1/(1-ρ²). Equivalently, adjustment reduces the required sample size by a fraction ρ², or roughly 16-36% when ρ ranges from 0.4 to 0.6, in line with the 15-35% savings quoted above.

2. Fixed-sample designs despite interim data

Most trials continue to planned completion despite accumulating interim evidence of efficacy or futility. Group sequential designs with pre-specified stopping boundaries can reduce expected sample size under the alternative hypothesis by 15-30% (O'Brien-Fleming) to 30-40% (Pocock).

3. Frequentist paradigm for Phase II decisions

Traditional hypothesis tests provide binary answers (p<0.05 or not) without quantifying the probability of Phase III success. Bayesian predictive probability frameworks enable more nuanced decisions with quantitative risk assessment.

2.2 Existing Software Limitations

Table 2

Software Limitations and Impact

| Limitation | Impact on Adoption |
|---|---|
| High cost: $5,000-$15,000/year | Small biotechs (Series A/B) priced out |
| IT barriers: Desktop installation, version control | Requires IT department involvement, delays adoption |
| Limited scope: Separate tools for each methodology | Users must purchase multiple products, learn different interfaces |
| Opaque validation: No published accuracy benchmarks | "Trust us" model inappropriate for regulatory submissions |
| Poor documentation: Sparse regulatory guidance citations | Additional work required for FDA/EMA submissions |

2.3 Zetyra Platform Overview

Zetyra addresses these inefficiencies through three integrated, validated statistical calculators:

CUPED

Calculates the sample size reduction from baseline covariate adjustment. Outputs: variance reduction factor (1-ρ²), adjusted sample size, and expected power gain.

Group Sequential Design

Calculates stopping boundaries for interim analyses. Supports O'Brien-Fleming, Pocock, and alpha-spending function boundaries, with sample sizes at each look.

Bayesian Predictive Power

Calculates the probability of trial success given interim data. Supports beta-binomial (binary) and normal-normal (continuous) models with futility/graduation thresholds.

3 CUPED: Covariate-Adjusted Power Analysis

3.1 Theoretical Foundation

CUPED (Controlled-experiment Using Pre-Experiment Data) is a variance reduction technique that leverages baseline covariates to improve statistical power. Originally developed by Microsoft Research (Deng et al., 2013) for online A/B testing, CUPED has proven applications in clinical trial design.

The key insight is that if a baseline measurement X is correlated with the outcome Y, incorporating X into the analysis reduces unexplained variance and increases statistical power. This is mathematically equivalent to ANCOVA.
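The adjustment described above can be sketched on simulated data. This is an illustration only, not Zetyra's implementation: the function names, the simulated data-generating model, and the seed are all assumptions. The outcome is adjusted as Y − θ(X − mean(X)) with θ = cov(X, Y)/var(X), and the ratio of adjusted to raw variance should equal 1 − r², where r is the sample correlation.

```python
import random
from statistics import mean

def cuped_adjust(x, y):
    """Return CUPED-adjusted outcomes y_i - theta * (x_i - mean(x))."""
    mx, my = mean(x), mean(y)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    theta = cov_xy / var_x  # least-squares slope of y on x
    return [b - theta * (a - mx) for a, b in zip(x, y)]

def sample_variance(v):
    """Unbiased sample variance."""
    m = mean(v)
    return sum((a - m) ** 2 for a in v) / (len(v) - 1)

def sample_correlation(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    ss_x = sum((a - mx) ** 2 for a in x)
    ss_y = sum((b - my) ** 2 for b in y)
    return cov_xy / (ss_x * ss_y) ** 0.5

# Hypothetical baseline/outcome pairs with true correlation of about 0.55
rng = random.Random(0)
rho = 0.55
x = [rng.gauss(0, 1) for _ in range(2000)]
y = [rho * a + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1) for a in x]

y_adj = cuped_adjust(x, y)
ratio = sample_variance(y_adj) / sample_variance(y)
r = sample_correlation(x, y)
print(f"variance ratio = {ratio:.4f}, 1 - r^2 = {1 - r ** 2:.4f}")
```

The adjusted outcome keeps the same mean as the raw outcome, so the treatment effect estimate is unchanged in expectation while its variance shrinks.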

3.2 Mathematical Framework

Variance Reduction Factor

VRF = 1 - ρ²

where ρ is the Pearson correlation between baseline covariate X and outcome Y.

Sample Size Adjustment

n_CUPED = n_standard × (1 - ρ²)

Sample size decreases proportionally to variance reduction.

Variance Reduction by Correlation

Table 3

VRF and Sample Size Reduction by Correlation

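| Correlation (ρ) | VRF | Variance Reduction | Sample Size Reduction |
|---|---|---|---|
| 0.0 | 1.00 | 0% | 0% |
| 0.3 | 0.91 | 9% | 9% |
| 0.5 | 0.75 | 25% | 25% |
| 0.6 | 0.64 | 36% | 36% |
| 0.7 | 0.51 | 49% | 49% |
| 0.9 | 0.19 | 81% | 81% |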
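The VRF and sample size formulas can be reproduced in a few lines. The sketch below is a hypothetical helper, not Zetyra's API. Rounding the adjusted sample size up to a whole patient is an assumption on my part.

```python
import math

def variance_reduction_factor(rho):
    """VRF = 1 - rho^2 for a baseline/outcome correlation rho."""
    if not -1.0 < rho < 1.0:
        raise ValueError("rho must lie strictly between -1 and 1")
    return 1.0 - rho ** 2

def cuped_sample_size(n_standard, rho):
    """Adjusted sample size n_standard * (1 - rho^2), rounded up.

    The product is rounded to 9 decimals first so floating-point noise
    (e.g. 51.00000000000001) does not push the ceiling up by one.
    """
    return math.ceil(round(n_standard * variance_reduction_factor(rho), 9))

# Reproduce the VRF and reduction values for the tabulated correlations
for rho in (0.0, 0.3, 0.5, 0.6, 0.7, 0.9):
    vrf = variance_reduction_factor(rho)
    print(f"rho={rho:.1f}  VRF={vrf:.2f}  reduction={1 - vrf:.0%}")

# A 240-patient standard design with rho = 0.55
print(cuped_sample_size(240, 0.55))
```

With ρ = 0.55 the factor is 0.6975, so a 240-patient design becomes 167.4, which rounds up to the 168 patients quoted elsewhere in this paper.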

3.3 Regulatory Considerations

FDA Guidance (May 2023)

“FDA encourages sponsors to consider covariate adjustment as a way to improve the precision of treatment effect estimates and increase statistical power.”

The FDA released updated guidance explicitly encouraging covariate adjustment as "low-hanging fruit" to improve trial efficiency. Covariate adjustment should be pre-specified in the statistical analysis plan before database lock and unblinding.

3.4 Benchmark Correlations (Walters et al., 2019)

Analysis of 464 correlations from 20 UK Health Technology Assessment trials:

Mean correlation: ρ = 0.50 (median 0.51, SD 0.15)

  • Depression scales (PHQ-9): ρ = 0.66
  • Physical functioning (SF-36): ρ = 0.64
  • Quality of life (EQ-5D): ρ = 0.55
  • Pain scales (VAS): ρ = 0.41

4 Group Sequential Design

4.1 Theoretical Foundation

Group Sequential Designs (GSD) allow pre-planned interim analyses during clinical trials while maintaining overall Type I error control. This adaptive approach enables early termination for efficacy (if treatment effect is compelling) or futility (if success appears unlikely), substantially reducing expected trial duration and sample size.

Type I Error Without Adjustment

Table 4

Type I Error Inflation Without Multiple Testing Adjustment

| Number of Looks (K) | Naive α per look | True Type I Error |
|---|---|---|
| 1 | 0.05 | 0.050 |
| 2 | 0.05 | 0.083 |
| 3 | 0.05 | 0.106 |
| 5 | 0.05 | 0.141 |

Group sequential methods achieve error control through carefully calibrated critical values at each analysis using alpha-spending functions.
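The inflation from repeated unadjusted testing can be checked with a small Monte Carlo simulation. The sketch assumes equally spaced looks and a two-sided test at α = 0.05 per look. The function name, seed, and simulation count are illustrative choices, and the estimates carry simulation error in the third decimal place.

```python
import random
from statistics import NormalDist

def naive_type_one_error(n_looks, alpha=0.05, n_sims=100_000, seed=1):
    """Monte Carlo estimate of the overall two-sided Type I error when each of
    n_looks equally spaced analyses is tested at the unadjusted level alpha."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    rng = random.Random(seed)
    rejections = 0
    for _ in range(n_sims):
        cumulative = 0.0
        for k in range(1, n_looks + 1):
            cumulative += rng.gauss(0.0, 1.0)  # new block of data under H0
            z_k = cumulative / k ** 0.5        # standardized statistic at look k
            if abs(z_k) > z_crit:
                rejections += 1                # trial "stops" at first crossing
                break
    return rejections / n_sims

for looks in (1, 2, 3, 5):
    print(looks, round(naive_type_one_error(looks), 3))
```

Each simulated trial accumulates independent standard-normal increments, so the statistics at successive looks have the correlation structure that makes the overall error grow more slowly than K × α.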

4.2 Alpha-Spending Functions

O'Brien-Fleming

Conservative early boundaries that preserve final analysis power. Most commonly used in confirmatory trials.

Very high thresholds early (Z = 4.56 at 20% information with K=5); the final boundary (about 2.04 for K=5) stays close to the fixed-sample value of 1.96.

Pocock

Constant boundaries across analyses. More aggressive early stopping but requires larger sample size.

Same critical value at every look (Z ≈ 2.41 for K=5), which corresponds to a constant nominal significance level rather than an equal split of α.
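To make the contrast concrete, the sketch below evaluates the Lan-DeMets spending functions that approximate the two boundary families (Lan and DeMets, 1983). It shows cumulative two-sided α spent at each information fraction, not the boundaries themselves, and spending-function boundaries are close to, but not identical to, the classical O'Brien-Fleming and Pocock values. The function names are illustrative.

```python
from math import e, log
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def obf_spending(t, alpha=0.05):
    """O'Brien-Fleming-type spending: 2 - 2 * Phi(z_{alpha/2} / sqrt(t))."""
    z = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    return 2.0 * _STD_NORMAL.cdf(-z / t ** 0.5)

def pocock_spending(t, alpha=0.05):
    """Pocock-type spending: alpha * ln(1 + (e - 1) * t)."""
    return alpha * log(1.0 + (e - 1.0) * t)

# Cumulative alpha spent at five equally spaced information fractions
for t in (0.2, 0.4, 0.6, 0.8, 1.0):
    print(f"t={t:.1f}  OBF={obf_spending(t):.5f}  Pocock={pocock_spending(t):.5f}")
```

Both functions spend the full α by t = 1. The O'Brien-Fleming type spends almost nothing early, which is why its early boundaries are so high, while the Pocock type spends α at a much more even rate.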

Sample Size Inflation Factor

Table 5

Sample Size Inflation by Boundary Type

Boundary TypeK=2K=3K=4K=5
O'Brien-Fleming1.011.021.021.03
Pocock1.101.141.161.17

4.3 Example: HPTN 083 Trial

The HPTN 083 HIV prevention trial (Landovitz et al., NEJM 2021) used a 4-look O'Brien-Fleming design:

| Analysis | Events | Z-boundary | HR Boundary |
|---|---|---|---|
| Look 1 (25%) | 44 | 4.333 | 0.39 |
| Look 2 (50%) | 88 | 2.963 | 0.66 |
| Look 3 (75%) | 132 | 2.359 | 0.82 |
| Look 4 (100%) | 176 | 1.993 | 0.91 |

Outcome: Trial stopped at Look 1 with observed HR = 0.29, crossing the efficacy boundary. FDA approved cabotegravir for PrEP in December 2021.
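The first-look boundary can be derived directly from a spending function, because no α has been spent before the first analysis. The sketch assumes a Lan-DeMets O'Brien-Fleming-type spending function with one-sided α = 0.025. Neither assumption is stated in the text above, and the function name is illustrative. Boundaries at later looks need numerical integration over the joint distribution of the test statistics (as done by gsDesign) and are not attempted here.

```python
from statistics import NormalDist

def first_look_boundary(info_fraction, alpha_one_sided=0.025):
    """Efficacy z-boundary at the first interim analysis under an assumed
    Lan-DeMets O'Brien-Fleming-type spending function (one-sided alpha)."""
    nd = NormalDist()
    z_half = nd.inv_cdf(1 - alpha_one_sided / 2)
    # Cumulative one-sided alpha spent by this information fraction
    spent = 2.0 * nd.cdf(-z_half / info_fraction ** 0.5)
    # Nothing was spent earlier, so the boundary is the upper quantile
    # whose tail area equals the alpha spent so far
    return nd.inv_cdf(1 - spent)

print(round(first_look_boundary(0.25), 3))
```

Under these assumptions the value at 25% information should land close to the Look 1 boundary listed for HPTN 083.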

5 Bayesian Predictive Power

5.1 Theoretical Foundation

Bayesian Predictive Probability of Success (PPoS) provides a framework for interim decision-making by computing the probability that a trial will succeed at its final analysis, given accumulated interim data and prior beliefs. Unlike frequentist conditional power (which conditions on a fixed parameter value), Bayesian predictive power integrates over the posterior distribution, properly accounting for parameter uncertainty.

Conditional Power vs. Predictive Power

Table 6

Comparison of Frequentist and Bayesian Approaches

| Aspect | Conditional (Frequentist) | Predictive (Bayesian) |
|---|---|---|
| Parameter Treatment | Fixed at specific value θ* | Distribution π(θ|data) |
| Uncertainty | Ignores parameter uncertainty | Fully accounts for uncertainty |
| Interpretation | "If true effect is θ*, probability of success" | "Given what we know now, probability of success" |
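The difference between the two quantities is easy to see for a normally distributed test statistic. The sketch uses the standard B-value formulation, with a one-sided α = 0.025 and a flat (non-informative) prior for the predictive calculation. Both are assumptions made for illustration, as are the function names and the interim values.

```python
from statistics import NormalDist

_ND = NormalDist()

def conditional_power(z_interim, t, theta, alpha_one_sided=0.025):
    """P(final Z > z_alpha | interim Z) for a FIXED drift theta,
    where theta is the expected value of the final z-statistic."""
    z_alpha = _ND.inv_cdf(1 - alpha_one_sided)
    b_t = z_interim * t ** 0.5  # B-value at information fraction t
    return 1 - _ND.cdf((z_alpha - b_t - theta * (1 - t)) / (1 - t) ** 0.5)

def predictive_power_flat_prior(z_interim, t, alpha_one_sided=0.025):
    """Conditional power averaged over the posterior of the drift
    under a flat prior: Phi((Z_t - z_alpha * sqrt(t)) / sqrt(1 - t))."""
    z_alpha = _ND.inv_cdf(1 - alpha_one_sided)
    return _ND.cdf((z_interim - z_alpha * t ** 0.5) / (1 - t) ** 0.5)

z_t, t = 2.0, 0.5
theta_hat = z_t / t ** 0.5  # drift implied by the current trend
print(f"Conditional power at current trend: {conditional_power(z_t, t, theta_hat):.3f}")
print(f"Predictive power (flat prior):      {predictive_power_flat_prior(z_t, t):.3f}")
```

Because predictive power averages over plausible effect sizes instead of fixing one, it is pulled toward 50% relative to conditional power evaluated at the current trend.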

5.2 Conjugate Prior Families

Beta-Binomial (Binary)

For response rates, success/failure outcomes

Posterior: Beta(α₀ + x, β₀ + n - x)

Normal-Normal (Continuous)

For mean changes, biomarkers

Posterior: Normal, with mean equal to the precision-weighted average of the prior mean and the sample mean
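Both conjugate updates have closed forms, sketched below. The function names and example inputs are hypothetical, and the normal-normal update assumes a known sampling standard deviation.

```python
def beta_binomial_update(alpha0, beta0, successes, n):
    """Posterior Beta(alpha0 + x, beta0 + n - x) after x responders out of n."""
    return alpha0 + successes, beta0 + n - successes

def normal_normal_update(prior_mean, prior_sd, sample_mean, sigma, n):
    """Posterior mean and SD for a normal mean with known sampling SD sigma."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / sigma ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    return post_mean, post_precision ** -0.5

# Binary endpoint: uniform Beta(1, 1) prior, 12 responders out of 20
a, b = beta_binomial_update(1, 1, successes=12, n=20)
print(f"Beta({a}, {b}), posterior mean = {a / (a + b):.3f}")

# Continuous endpoint: illustrative prior and interim summary statistics
m, s = normal_normal_update(prior_mean=0.0, prior_sd=10.0,
                            sample_mean=25.0, sigma=40.0, n=20)
print(f"posterior mean = {m:.2f}, posterior SD = {s:.2f}")
```

In the normal-normal case the posterior mean sits between the prior mean and the sample mean, weighted by how much information each contributes.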

5.3 Decision Framework

Table 7

PPoS Decision Thresholds

| PPoS Range | Recommendation | Rationale |
|---|---|---|
| < 10% | Stop for futility | <10% chance of success; avoid wasting resources |
| 10-30% | Borderline; re-evaluate | Consider design modifications, biomarker refinement |
| 30-50% | Continue with caution | May proceed if unmet medical need high |
| > 50% | Proceed to Phase III | >50% success probability justifies investment |
| > 85% | Consider early graduation | Very promising; I-SPY 2 uses 85% threshold |
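A PPoS calculation for a single-arm binary endpoint, combined with the thresholds in Table 7, might look like the sketch below. It rests on several assumptions: success at the final analysis is defined as reaching a minimum responder count `r_min` (a simplification of a posterior-probability criterion), the prior is uniform by default, and the treatment of values that fall exactly on a band edge is my choice. All names are hypothetical, and this is not Zetyra's implementation.

```python
from math import exp, lgamma

def _log_beta(a, b):
    """Log of the Beta function."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)

def beta_binomial_pmf(y, m, a, b):
    """P(y responders among m future patients) when p ~ Beta(a, b)."""
    log_choose = lgamma(m + 1) - lgamma(y + 1) - lgamma(m - y + 1)
    return exp(log_choose + _log_beta(a + y, b + m - y) - _log_beta(a, b))

def ppos_single_arm(x, n, n_total, r_min, a0=1.0, b0=1.0):
    """Predictive probability that total responders reach r_min by n_total,
    given x responders among the first n patients and a Beta(a0, b0) prior."""
    a, b = a0 + x, b0 + n - x       # posterior after the interim data
    m = n_total - n                 # patients still to be enrolled
    needed = max(r_min - x, 0)      # further responders required
    return sum(beta_binomial_pmf(y, m, a, b) for y in range(needed, m + 1))

def recommend(ppos):
    """Map a PPoS value to the Table 7 bands, most specific band first."""
    if ppos > 0.85:
        return "Consider early graduation"
    if ppos > 0.50:
        return "Proceed to Phase III"
    if ppos >= 0.30:
        return "Continue with caution"
    if ppos >= 0.10:
        return "Borderline; re-evaluate"
    return "Stop for futility"

# Illustrative interim: 8 responders in 20 patients, 30 planned, 14 needed
p = ppos_single_arm(x=8, n=20, n_total=30, r_min=14)
print(f"PPoS = {p:.2f} -> {recommend(p)}")
```

Checking the 85% band before the 50% band resolves the overlap between the last two rows of the table.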

FDA Draft Guidance (January 12, 2026)

“Bayesian methodologies help address two of the biggest problems of drug development: high costs and long timelines.”

— FDA Commissioner Marty Makary, on the draft guidance extending Bayesian methodology to drugs and biologics

6 Validation Framework

6.1 Overview and Methodology

Zetyra calculators undergo comprehensive external validation through three complementary approaches: (1) software benchmarking against established reference implementations, (2) analytical formula verification using closed-form solutions, and (3) published clinical trial replication.

Open Source Validation

All validation code, test data, and results are publicly available at github.com/evidenceinthewild/zetyra-validation under the MIT license. This enables independent verification, continuous validation via GitHub Actions, and community contribution.


Validation Summary

Table 8

Validation Results Summary

| Calculator | Tests | Status | Max Deviation | Reference |
|---|---|---|---|---|
| GSD | 30 | ✓ 100% | 0.0046 z-score | gsDesign R package |
| CUPED | 12 | ✓ 100% | Exact | Analytical VRF = 1-ρ² |
| Bayesian | 9 | ✓ 100% | Exact | Conjugate prior formulas |
| Total | 51 | 100% | 0.0046 | Multiple benchmarks |

6.2 Clinical Trial Replications

HPTN 083

Phase 3 HIV prevention trial. 4-look O'Brien-Fleming design replicated within 0.000 z-score deviation.

✓ All boundaries exact match

HeartMate II

LVAD trial with unequal information fractions [0.27, 0.67, 1.00]. All boundary properties verified.

✓ Unequal spacing validated

7 Case Studies

These case studies represent realistic scenarios constructed from published trial parameters and literature-supported assumptions. Actual benefits vary by trial characteristics.

7.1 Oncology Phase II: Sample Size Reduction via CUPED

HER2-positive breast cancer trial. Baseline tumor burden (SLD) correlation ρ = 0.55 with response.

| Metric | Standard | CUPED | Savings |
|---|---|---|---|
| Sample Size | 240 | 168 | 72 (30%) |
| Duration | 14 mo | 10.4 mo | 3.6 mo |
| Total Cost | $12.0M | $8.4M | $3.6M |

7.2 Cardiovascular Phase III: Early Stopping with GSD

PCSK9 inhibitor for MACE prevention. 4-look O'Brien-Fleming design with 2,400 patients.

| Metric | Fixed | GSD (Stopped) | Savings |
|---|---|---|---|
| Duration | 48 mo | 36 mo | 12 mo (25%) |
| Events | 430 | 220 | 210 events |
| Cost | $76.8M | $58.7M | $18.1M |

Outcome: Stopped at Interim 2 (HR = 0.68). FDA Priority Review granted, with approval 12 months earlier.

7.3 Rare Disease Trial: Bayesian Go/No-Go Decision

Gene therapy for Duchenne muscular dystrophy. N = 30 patients (limited by prevalence).

  • Interim (n=20): PPoS = 42%. Decision: continue enrollment.
  • Final (N=30): P(Δ6MWD > 30m) = 78%. Decision: proceed to pivotal.

Outcome: Breakthrough Therapy Designation granted. Accelerated Approval obtained 18 months earlier than under the traditional pathway.

7.4 Comparative Analysis: Full Program Integration

NSCLC immunotherapy Phase II/III program with all three methodologies integrated.

| Metric | Traditional | Zetyra-Optimized | Savings |
|---|---|---|---|
| Total Duration | 66 months | 50 months | 16 mo (24%) |
| Total Cost | $104M | $89.9M | $14.1M (14%) |
| Time to BLA | Month 72 | Month 56 | 16 mo earlier |

8 Conclusions

8.1 Summary of Capabilities

Zetyra provides a validated, integrated platform of three statistical calculators addressing complementary inefficiencies in clinical trial design:

CUPED

  • Leverages baseline-outcome correlations to reduce sample size by 15-35%
  • Validated against analytical VRF formula with exact matches
  • Supported by FDA May 2023 guidance on covariate adjustment

Group Sequential Design

  • Enables interim efficacy/futility monitoring with Type I error control
  • Validated against gsDesign (max deviation 0.0046 z-score)
  • O'Brien-Fleming requires only 1-3% sample size inflation; group sequential designs enable a 15-40% expected reduction under H₁

Bayesian Predictive Power

  • Computes probability of success given interim data and prior beliefs
  • Validated against analytical conjugate prior formulas (exact matches)
  • Enables quantitative go/no-go decisions vs. binary p-value thresholds

8.2 Competitive Positioning

Table 9

Feature Comparison

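| Capability | Zetyra | East | PASS | nQuery |
|---|---|---|---|---|
| CUPED Calculator | | | | |
| Group Sequential | | | | |
| Bayesian Predictive | | | | |
| Public Validation | | | | |
| Web-Based | | | | |
| Annual Cost | $1,188 | $15,000 | $8,000 | $6,000 |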

The future of clinical trial design is transparent, validated, accessible, and efficient.

As regulatory agencies increasingly encourage efficient designs, methodologies like covariate adjustment, group sequential monitoring, and Bayesian predictive power will transition from competitive advantage to industry standard.

9 References

Statistical Methodology

1. Deng A, Xu Y, Kohavi R, Walker T. Improving the sensitivity of online controlled experiments by utilizing pre-experiment data. WSDM 2013.

2. Frison L, Pocock SJ. Repeated measures in clinical trials. Statistics in Medicine 1992.

3. O'Brien PC, Fleming TR. A multiple testing procedure for clinical trials. Biometrics 1979.

4. Lan KKG, DeMets DL. Discrete sequential boundaries for clinical trials. Biometrika 1983.

5. Jennison C, Turnbull BW. Group Sequential Methods with Applications to Clinical Trials. Chapman & Hall, 2000.

6. Berry SM, et al. Bayesian Adaptive Methods for Clinical Trials. CRC Press, 2010.

7. Spiegelhalter DJ, Freedman LS. Monitoring clinical trials: conditional or predictive power? Controlled Clinical Trials 1986.

Empirical Studies

8. Walters SJ, et al. Sample size estimation for RCTs with repeated assessment of PROs. Trials 2019. [464 correlations, mean ρ=0.50]

9. DiMasi JA, et al. Innovation in the pharmaceutical industry: New estimates of R&D costs. J Health Economics 2016. [$2.6B average]

10. Moore TJ, et al. Estimated costs of pivotal trials. JAMA Internal Medicine 2018.

Regulatory Guidance

11. FDA. Adjusting for Covariates in Randomized Clinical Trials. May 2023.

12. FDA. Adaptive Designs for Clinical Trials of Drugs and Biologics. November 2019.

13. FDA. Use of Bayesian Methodology in Clinical Trials (Draft). January 12, 2026.

14. EMA. Guideline on Adjustment for Baseline Covariates. EMA/CHMP/295050/2013.

15. ICH E9(R1). Estimands and Sensitivity Analysis. November 2019.

Published Clinical Trials

16. Landovitz RJ, et al. Cabotegravir for HIV prevention. NEJM 2021. [HPTN 083]

17. Slaughter MS, et al. HeartMate II LVAD trial. NEJM 2009.

18. Barker AD, et al. I-SPY 2: Adaptive breast cancer trial design. Clin Pharmacol Ther 2009.

19. Maude SL, et al. Tisagenlecleucel in pediatric ALL. NEJM 2018.

Software

20. Anderson K. gsDesign: Group Sequential Design. R package v3.6.4. 2024.

Full 60-reference bibliography available in PDF version, including foundational statistical works, computational methods, and additional regulatory documents.

10 Appendices

Appendix A: API Documentation

Zetyra provides a RESTful API for programmatic access to all calculators.

POST https://zetyra-backend.../api/v1/cuped
POST https://zetyra-backend.../api/v1/gsd
POST https://zetyra-backend.../api/v1/bayesian/binary

Full API documentation with Python/R client examples available in PDF version.

Appendix B: Validation Test Results

Complete validation results available at GitHub repository:

github.com/evidenceinthewild/zetyra-validation/results

Appendix C: Regulatory Guidance Quick Reference

CUPED: FDA-2023-D-1711 (May 2023), EMA/CHMP/295050/2013

GSD: FDA-2018-D-3124 (November 2019), CHMP/EWP/2459/02

Bayesian: FDA-2024-D-5829 (Draft, January 12, 2026)

Estimands: ICH E9(R1) (November 2019)

Appendix D: Glossary

Alpha (α): Type I error rate
VRF: Variance Reduction Factor (1-ρ²)
GSD: Group Sequential Design
PPoS: Predictive Probability of Success
DSMB: Data Safety Monitoring Board
Conjugate Prior: Prior yielding same-family posterior

Full glossary with 30+ terms available in PDF version.

Appendix E: Platform Architecture

Frontend

React 18 + TypeScript, Tailwind CSS, Recharts

Backend

Python FastAPI, NumPy/SciPy, gsDesign via rpy2

Infrastructure

Google Cloud Run, PostgreSQL, 99.9% SLA

Security

OAuth 2.0, TLS 1.3, AES-256, SOC 2 in progress

Suggested Citation

Qian, Lu. (2026). Zetyra: A Validated Suite of Statistical Calculators for Efficient Clinical Trial Design (Version 1.0). Zenodo. https://doi.org/10.5281/zenodo.18253308

DOI: 10.5281/zenodo.18253308
