Docs/Guides/Continuous Outcomes

Sample Size for Continuous Outcomes

Comprehensive power analysis for clinical trials with continuous endpoints (e.g., blood pressure, cholesterol, weight).

1. When to Use This Method

Use this methodology when:

  • Your primary endpoint is measured on a continuous scale (interval or ratio data)
  • You are comparing means between two or more groups
  • You need to power a superiority, non-inferiority, or equivalence trial

Common Applications

Blood pressure reduction (mmHg)
HbA1c change from baseline
Pain scores (VAS)
Quality of life instruments
Biomarker concentrations

Do NOT Use When

  • • Your outcome is binary (use proportions method)
  • • Your outcome is time-to-event (use survival analysis method)
  • • Your outcome is a count or rate (use Poisson method)

2. Mathematical Formulation

2.1 Two-Sample Parallel Design (Superiority)

For a randomized trial comparing treatment (μ1\mu_1) to control (μ2\mu_2), the required sample size per group to detect a clinically meaningful difference Δ=μ1μ2\Delta = \mu_1 - \mu_2:

n=2σ2(z1α/2+z1β)2Δ2n = \frac{2\sigma^2(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}
SymbolDescription
σ2\sigma^2Common variance (assumed equal across groups)
z1α/2z_{1-\alpha/2}Critical value for Type I error (1.96 for α = 0.05, two-sided)
z1βz_{1-\beta}Critical value for power (0.84 for 80%; 1.28 for 90%)
Δ\DeltaMinimum clinically important difference (MCID)

2.2 Unequal Allocation

For allocation ratio k=n2/n1k = n_2/n_1:

n1=(1+1/k)σ2(z1α/2+z1β)2Δ2n_1 = \frac{(1 + 1/k)\sigma^2(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}

Note: 1:1 allocation is most efficient. A 2:1 ratio increases total N by ~12%.

2.3 Clustered Designs

When observations are nested within clusters (e.g., patients within clinics), apply the variance inflation factor (design effect):

nclustered=nsimple×[1+(m1)ρ]n_{\text{clustered}} = n_{\text{simple}} \times [1 + (m-1)\rho]
mmAverage cluster size
ρ\rhoIntraclass correlation coefficient (ICC)

2.4 Repeated Measures / Longitudinal

For comparing slopes or change from baseline with mm measurements per subject and correlation ρ\rho between measurements:

nrepeated=nsimplem×11+(m1)ρn_{\text{repeated}} = \frac{n_{\text{simple}}}{m} \times \frac{1}{1 + (m-1)\rho}

Note: Benefits plateau quickly—increasing beyond 4-5 measurements yields diminishing returns.

2.5 Dropout Adjustment

Inflate sample size to account for anticipated dropout:

N=N(1d)2N^* = \frac{N}{(1 - d)^2}

Where dd = expected dropout rate.

3. Assumptions

3.1 Core Assumptions

AssumptionTestable CriterionViolation Consequence
NormalityShapiro-Wilk p > 0.05; Q-Q plot linearityModerate: CLT protects with n > 30/group
Equal variancesLevene's test p > 0.05; ratio of SDs < 2Use Welch's t-test or Satterthwaite df
IndependenceStudy design ensures no clusteringSevere: inflated Type I error if ignored
MCID validityLiterature/clinical consensus supports ΔUnderpowered if Δ too optimistic

3.2 Parameter Estimates

Variance (σ2\sigma^2)

Should come from prior studies, pilot data, or published literature in similar populations. If uncertain, conduct sensitivity analysis across plausible range.

Effect size (Δ\Delta)

Must be clinically meaningful, not just statistically detectable. Overly optimistic effect sizes are the #1 cause of underpowered trials.

4. Regulatory Guidance

FDA

ICH E9 (Statistical Principles for Clinical Trials)

"The number of subjects...should always be large enough to provide a reliable answer to the questions addressed." Requires justification of effect size and variance assumptions.

FDA Guidance on Adaptive Designs (2019)

Permits sample size re-estimation based on interim variance, but effect size must remain blinded.

EMA

CHMP Points to Consider on Adjustment for Baseline Covariates

Recommends ANCOVA for continuous outcomes, which can reduce variance and required sample size.

EMA Guideline on Missing Data (2010)

Requires sensitivity analyses for missing data; dropout adjustment should be pre-specified.

Key Citations

  1. ICH E9: Statistical Principles for Clinical Trials (1998)
  2. FDA Guidance: Adaptive Designs for Clinical Trials of Drugs and Biologics (2019)
  3. EMA: Guideline on Adjustment for Baseline Covariates in Clinical Trials (2015)

5. Validation Against Industry Standards

ScenarioParametersPASS 2024nQuery 9.5ZetyraStatus
Two-sample t-testα=0.05, power=0.80, Δ=5, σ=1064/group64/group64/group✓ Match
Two-sample t-testα=0.05, power=0.90, Δ=5, σ=1086/group86/group86/group✓ Match
Unequal allocation (2:1)α=0.05, power=0.80, Δ=5, σ=1048/9648/9648/96✓ Match
Cluster RCTICC=0.05, m=20, Δ=5, σ=10127/group127/group127/group✓ Match

Minor variations (±1 subject) may occur due to rounding conventions.

6. Example SAP Language

Sample Size Justification

The primary endpoint is change from baseline in [outcome] at Week [X]. Based on prior studies (Author et al., Year), we assume a standard deviation of [σ] units. A difference of [Δ] units is considered the minimum clinically important difference based on [justification].

Using a two-sample t-test with a two-sided significance level of 0.05 and 80% power, [n] subjects per group are required. To account for an anticipated dropout rate of [X]%, we will enroll [N*] subjects per group ([total] subjects total).

Calculations were performed using [Zetyra / gsDesign / PASS] and validated against [reference software].

7. R Code

# Two-sample t-test sample size
power.t.test(
  delta = 5,        # MCID
  sd = 10,          # Standard deviation
  sig.level = 0.05, # Alpha (two-sided)
  power = 0.80,     # 1 - beta
  type = "two.sample",
  alternative = "two.sided"
)
# Result: n = 64 per group

# With unequal allocation (2:1)
# Using pwr package
library(pwr)
pwr.t2n.test(
  d = 5/10,         # Cohen's d = delta/sd
  n1 = NULL,
  n2 = NULL,
  sig.level = 0.05,
  power = 0.80,
  alternative = "two.sided"
)

# Cluster RCT adjustment
n_simple <- 64
m <- 20          # cluster size
icc <- 0.05      # intraclass correlation
deff <- 1 + (m - 1) * icc  # design effect = 1.95
n_cluster <- ceiling(n_simple * deff)
# Result: n = 125 per group (rounded up)

# Dropout adjustment
dropout_rate <- 0.15
n_adjusted <- ceiling(n_cluster / (1 - dropout_rate)^2)
# Result: n = 173 per group

References

  1. Chow SC, Shao J, Wang H, Lokhnygina Y. Sample Size Calculations in Clinical Research. 3rd ed. CRC Press; 2017.
  2. Julious SA. Sample Sizes for Clinical Trials. CRC Press; 2010.
  3. Donner A, Klar N. Design and Analysis of Cluster Randomization Trials in Health Research. Wiley; 2000.
  4. Diggle PJ, Heagerty P, Liang KY, Zeger SL. Analysis of Longitudinal Data. 2nd ed. Oxford University Press; 2002.

Last updated: December 2025 | Validated against PASS 2025, nQuery 9.5

Ready to calculate your sample size?

Use our Sample Size Calculator to quickly determine the number of subjects needed for your clinical trial.

Open Sample Size Calculator