Sample Size for Longitudinal and Repeated Measures

Comprehensive power analysis for clinical trials with multiple measurements per subject over time.

1. When to Use This Method

Use this methodology when:

  • Each subject is measured multiple times at pre-specified intervals
  • You want to compare trajectories (slopes, rates of change) between groups
  • You want to compare endpoints while adjusting for baseline
  • You need to model within-subject correlation explicitly

Common Designs

| Design | Description | Use Case |
|---|---|---|
| Parallel longitudinal | Groups randomized, measured over time | Most clinical trials |
| Crossover | Each subject receives all treatments | Chronic conditions, washout possible |
| Pre-post | Baseline + one follow-up | Simple intervention studies |
| N-of-1 | Single patient, multiple treatment periods | Personalized medicine |
| Interrupted time series | Single unit, multiple observations pre/post | Policy evaluation |

Common Applications

  • Disease progression trials (Alzheimer's, MS, ALS)
  • Growth studies (pediatric development)
  • Pharmacokinetic studies (drug concentration over time)
  • Chronic disease management (diabetes, hypertension)
  • Behavioral interventions (weight loss, smoking cessation)
  • Quality of life trajectories

Contraindications

  • Single measurement per subject: use standard methods
  • Time-to-event outcome with censoring: use survival analysis
  • Cluster-level randomization: use cluster methods (these can be combined with repeated-measures methods)
  • No meaningful time structure to measurements

2. Mathematical Formulation

2.1 Comparison Types

| Comparison | Question Answered | Statistical Approach |
|---|---|---|
| Endpoint means | Are final values different? | ANCOVA, MMRM |
| Change from baseline | Is the change different? | Paired t-test, ANCOVA |
| Slopes | Are rates of change different? | Mixed models, GEE |
| AUC | Is total exposure different? | Summary statistic |
| Time × Treatment | Do trajectories diverge? | Mixed models |

2.2 Sample Size for Comparing Slopes

For comparing rates of change between two groups with $m$ equally spaced measurements:

$$n = \frac{2\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2}{(\beta_1 - \beta_0)^2 \times V_t}$$

n = subjects per group, β₁ - β₀ = difference in slopes between groups, V_t = Σ(t_j - t̄)² = sum of squared deviations of the time points about their mean

For equally spaced times $t = 0, 1, \ldots, m-1$:

$$V_t = \frac{m(m^2-1)}{12}$$

Sum of squared deviations of the time points for an equally spaced design (for example, m = 4 gives V_t = 5)
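
As a quick numerical check, the closed form can be compared with the sum of squared deviations computed directly from the visit times. This is a minimal sketch; the helper names `vt_direct` and `vt_closed` are illustrative and not part of the guide.

```r
# Sum of squared deviations of the visit times about their mean
vt_direct <- function(times) sum((times - mean(times))^2)

# Closed form for equally spaced times 0, 1, ..., m - 1
vt_closed <- function(m) m * (m^2 - 1) / 12

m_values <- 2:8
data.frame(
  m = m_values,
  direct = sapply(m_values, function(m) vt_direct(0:(m - 1))),
  closed = sapply(m_values, vt_closed)
)
```

The two columns agree for every m, which confirms that V_t grows roughly with m³. This is why adding visits (or lengthening follow-up) is so effective for slope comparisons.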

2.3 Sample Size for Comparing Endpoint Means (ANCOVA)

When adjusting for baseline in a pre-post design:

$$n = \frac{2\sigma^2 (1-\rho^2)(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}$$

ρ = correlation between baseline and endpoint, (1-ρ²) = fraction of the outcome variance that remains after baseline adjustment

Efficiency Gain

If ρ = 0.5, ANCOVA reduces the required n by 25% (ρ² = 0.25). If ρ = 0.7, the reduction is 49% (ρ² = 0.49).

2.4 Sample Size for Change Scores

For comparing mean change from baseline:

$$n = \frac{4\sigma^2 (1-\rho)(z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}$$

ρ = correlation between baseline and follow-up
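
The change-score and ANCOVA formulas can be put side by side to see when each pays off. The sketch below uses the two formulas from Sections 2.3 and 2.4 with Δ = 5 and σ = 10; the function names `change_score_n` and `ancova_n` are illustrative.

```r
# Per-group n for a two-arm comparison of mean change from baseline
change_score_n <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
  z <- qnorm(1 - alpha / 2) + qnorm(power)
  ceiling(4 * sigma^2 * (1 - rho) * z^2 / delta^2)
}

# Per-group n for ANCOVA with baseline as covariate
ancova_n <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
  z <- qnorm(1 - alpha / 2) + qnorm(power)
  ceiling(2 * sigma^2 * (1 - rho^2) * z^2 / delta^2)
}

rho <- c(0.3, 0.5, 0.7)
data.frame(
  rho = rho,
  change_score = change_score_n(5, 10, rho),
  ancova = ancova_n(5, 10, rho)
)
```

Because 2(1 - ρ) ≥ 1 - ρ² for every ρ between 0 and 1, the change-score analysis never needs fewer subjects than ANCOVA. It beats an unadjusted endpoint comparison only when ρ > 0.5.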

2.5 General Repeated Measures Formula

For $m$ measurements with compound symmetry correlation $\rho$:

$$n = \frac{2\sigma^2 (z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2} \times \frac{1 + (m-1)\rho}{m}$$

The factor [1 + (m-1)ρ]/m is the variance of the mean of m measurements relative to a single measurement, so smaller values mean fewer subjects are needed

Efficiency Factor by Number of Measurements

| m | ρ = 0.3 | ρ = 0.5 | ρ = 0.7 |
|---|---|---|---|
| 2 | 0.65 | 0.75 | 0.85 |
| 3 | 0.53 | 0.67 | 0.80 |
| 4 | 0.48 | 0.63 | 0.78 |
| 5 | 0.44 | 0.60 | 0.76 |
| 10 | 0.37 | 0.55 | 0.73 |
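
The table entries follow directly from the factor [1 + (m-1)ρ]/m. A minimal sketch that regenerates the grid in base R (the name `cs_factor` is illustrative):

```r
# Variance factor [1 + (m - 1) * rho] / m under compound symmetry
cs_factor <- function(m, rho) (1 + (m - 1) * rho) / m

m_values <- c(2, 3, 4, 5, 10)
rho_values <- c(0.3, 0.5, 0.7)

# Rows index the number of measurements, columns the correlation
eff <- outer(m_values, rho_values, cs_factor)
dimnames(eff) <- list(m = m_values, rho = rho_values)
round(eff, 2)
```

Printed values can differ from the table in the last digit where an entry falls exactly on a rounding boundary (for example 0.625).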

2.6 Correlation Structures

| Structure | Assumption | Formula |
|---|---|---|
| Compound Symmetry (CS) | All pairs equally correlated | $\mathrm{Corr}(Y_i, Y_j) = \rho$ for all $i \neq j$ |
| AR(1) | Correlation decays with time lag | $\mathrm{Corr}(Y_i, Y_j) = \rho^{\lvert i-j \rvert}$ |
| Unstructured | No pattern assumed | Each pair estimated separately |
| Toeplitz | Correlation depends on lag only | $\mathrm{Corr}(Y_i, Y_j) = \rho_{\lvert i-j \rvert}$ |

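Impact on sample size: the effect depends on the contrast being tested. For the same ρ, AR(1) makes distant observations less correlated than CS. This lowers the variance of a time-averaged mean, but it weakens baseline adjustment and other within-subject contrasts between distant visits. A CS-based sample size can therefore be too small for an endpoint comparison when the true structure is AR(1).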
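
To make the structures concrete, the sketch below builds CS, AR(1), and Toeplitz correlation matrices in base R. It then computes the variance factor of a time-averaged mean, which is the sum of all correlations divided by m². The values of m, ρ, and the Toeplitz lag correlations are hypothetical, and the factor applies to that one contrast only.

```r
m <- 5
rho <- 0.6

# Compound symmetry: every off-diagonal entry equals rho
R_cs <- matrix(rho, m, m)
diag(R_cs) <- 1

# AR(1): entry (i, j) equals rho^|i - j|
R_ar1 <- toeplitz(rho^(0:(m - 1)))

# General Toeplitz: one free correlation per lag (hypothetical values)
R_toep <- toeplitz(c(1, 0.6, 0.5, 0.45, 0.4))

# Variance of the time-averaged mean relative to a single measurement
mean_factor <- function(R) sum(R) / nrow(R)^2

c(cs = mean_factor(R_cs), ar1 = mean_factor(R_ar1), toeplitz = mean_factor(R_toep))
```

For CS the result reduces to [1 + (m-1)ρ]/m from Section 2.5. Other contrasts, such as change from baseline to the last visit, depend on individual entries of the matrix rather than on its sum.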

2.7 Crossover Design

For a 2×2 crossover comparing two treatments:

$$n = \frac{2\sigma^2_w (z_{1-\alpha/2} + z_{1-\beta})^2}{\Delta^2}$$

n = total number of subjects, σ²_w = within-subject variance (typically much smaller than between-subject variance)

Efficiency

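Crossover designs require roughly $\frac{1-\rho}{2}$ as many subjects in total as parallel designs, where ρ = correlation between periods. This follows from $\sigma^2_w = \sigma^2(1-\rho)$: the parallel design needs $4\sigma^2(z_{1-\alpha/2} + z_{1-\beta})^2/\Delta^2$ subjects in total, the crossover $2\sigma^2(1-\rho)(z_{1-\alpha/2} + z_{1-\beta})^2/\Delta^2$.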
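
The comparison can be checked numerically. The sketch below assumes the usual variance decomposition σ²_w = σ²(1 - ρ), where σ² is the total outcome variance. That link and the function name `design_totals` are assumptions of the example. Totals are left unrounded so the ratio is exact.

```r
# Total subjects for a two-arm parallel design vs a 2x2 crossover,
# assuming sigma_w^2 = sigma^2 * (1 - rho)
design_totals <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
  z2 <- (qnorm(1 - alpha / 2) + qnorm(power))^2
  parallel_total  <- 2 * (2 * sigma^2 * z2 / delta^2)        # n per group x 2 groups
  crossover_total <- 2 * sigma^2 * (1 - rho) * z2 / delta^2  # Section 2.7 formula
  c(parallel = parallel_total,
    crossover = crossover_total,
    ratio = crossover_total / parallel_total)
}

# sigma = 10 with rho = 0.64 corresponds to a within-subject SD of 6
design_totals(delta = 5, sigma = 10, rho = 0.64)
```

The ratio depends only on ρ: the more stable subjects are between periods, the larger the saving from using each subject as their own control.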

2.8 Missing Data Adjustment

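Simple Inflation (MCAR assumption):

$$n_{adjusted} = \frac{n}{p_c^{\,k}}, \qquad p_c = (1-d)^{v-1}$$

d = per-visit dropout rate, v = number of visits, p_c = expected completion rate, k = exponent (1 for the conservative completers-only adjustment, 1/2 for the optimistic adjustment that credits partial information from dropouts)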

Pattern Mixture Approach:

For monotone dropout with expected completion rate $p_c$:

$$n_{adjusted} = n \times \frac{1}{p_c + (1-p_c) \times r_{info}}$$

r_info = information retained from incomplete observations (depends on analysis method)
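
A minimal sketch of the pattern mixture inflation, assuming r_info is supplied by the analyst as a number between 0 and 1. The function name `pattern_mixture_n` and the example values are illustrative.

```r
# Inflation for monotone dropout: completers count fully,
# dropouts contribute a fraction r_info of a completer's information
pattern_mixture_n <- function(n, p_complete, r_info) {
  stopifnot(p_complete > 0, p_complete <= 1, r_info >= 0, r_info <= 1)
  ceiling(n / (p_complete + (1 - p_complete) * r_info))
}

# r_info = 0 reduces to completers-only inflation n / p_c
pattern_mixture_n(n = 50, p_complete = 0.8, r_info = 0)

# Crediting dropouts with half the information of a completer
pattern_mixture_n(n = 50, p_complete = 0.8, r_info = 0.5)
```

Setting r_info = 1 returns the original n, so the two extremes bracket the plausible enrollment target.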

2.9 Diminishing Returns of Additional Measurements

| Measurements (m) | Relative efficiency (ρ=0.5) | Marginal gain |
|---|---|---|
| 1 | 1.00 | |
| 2 | 0.75 | 25% |
| 3 | 0.67 | 8% |
| 4 | 0.63 | 4% |
| 5 | 0.60 | 3% |
| 10 | 0.55 | ≈1% each |

Rule of Thumb

Beyond 4-5 measurements, additional timepoints yield minimal sample size reduction under compound symmetry correlation.
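
The marginal gains can be computed directly from the compound symmetry factor. This is a small sketch at ρ = 0.5; the variable names are illustrative.

```r
rho <- 0.5
m <- 1:10

# Variance factor relative to a single measurement (compound symmetry)
factor_m <- (1 + (m - 1) * rho) / m

# Drop in the factor from adding one more measurement, in percentage points
marginal_gain <- c(NA, -diff(factor_m)) * 100

data.frame(
  m = m,
  factor = round(factor_m, 3),
  marginal_gain = round(marginal_gain, 1)
)
```

The factor approaches ρ as m grows, so no number of extra visits can push the required sample size below ρ times the single-measurement value.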

3. Assumptions

3.1 Core Assumptions

| Assumption | Testable Criterion | Violation Consequence |
|---|---|---|
| Correct correlation structure | AIC/BIC model comparison | Inefficient estimates; incorrect SEs |
| Missing at Random (MAR) | Untestable; sensitivity analysis | Biased estimates if MNAR |
| Linear trajectories (for slope comparison) | Residual plots, polynomial terms | Wrong functional form invalidates slope comparison |
| No carryover (crossover) | Washout period adequate; test for period × treatment | Biased treatment effect |
| Homogeneous variance | Residual plots by group/time | Use heterogeneous variance models |
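
The AIC/BIC comparison in the first row can be sketched with `nlme::gls`, fitting the same mean model under two candidate correlation structures. The data below are simulated with AR(1) errors purely for illustration. The sample size, ρ, and mean trajectory are hypothetical, and a real check would use pilot or historical data.

```r
library(nlme)

set.seed(1)
n_subj <- 60
m <- 5
rho <- 0.7

# Simulate one subject's visits with AR(1) errors (hypothetical data)
sim_subject <- function(id) {
  e <- numeric(m)
  e[1] <- rnorm(1)
  for (j in 2:m) e[j] <- rho * e[j - 1] + sqrt(1 - rho^2) * rnorm(1)
  data.frame(subject = id, visit = 1:m, y = 10 + 0.5 * (1:m) + 3 * e)
}
dat <- do.call(rbind, lapply(seq_len(n_subj), sim_subject))
dat$subject <- factor(dat$subject)

# Same mean model, two candidate within-subject correlation structures
fit_cs <- gls(y ~ visit, data = dat,
              correlation = corCompSymm(form = ~ 1 | subject))
fit_ar1 <- gls(y ~ visit, data = dat,
               correlation = corAR1(form = ~ visit | subject))

# Lower AIC/BIC indicates the better-supported structure
AIC(fit_cs, fit_ar1)
BIC(fit_cs, fit_ar1)
```

Both fits use REML (the `gls` default) with identical fixed effects, which is the setting in which information criteria for covariance structures are comparable.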

3.2 Missing Data Mechanisms

| Mechanism | Definition | Example | Analysis Implication |
|---|---|---|---|
| MCAR | Missingness unrelated to any data | Random administrative error | Complete case valid |
| MAR | Missingness related to observed data only | Sicker patients miss visits but sickness is measured | Mixed models, multiple imputation valid |
| MNAR | Missingness related to unobserved outcome | Dropout due to lack of efficacy | Sensitivity analysis required |
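
A small simulation illustrates the three mechanisms. The sketch below assumes standardized baseline and follow-up values with correlation 0.6 and a true follow-up mean of 0. The dropout models and all names are hypothetical choices for the example.

```r
set.seed(42)
n <- 1e5
rho <- 0.6

# Standardized baseline and follow-up with correlation rho (true mean = 0)
baseline <- rnorm(n)
followup <- rho * baseline + sqrt(1 - rho^2) * rnorm(n)

# Probability that the follow-up is missing under each mechanism
p_mcar <- rep(0.3, n)                  # unrelated to any data
p_mar  <- plogis(-1 - 1.5 * baseline)  # depends on the observed baseline only
p_mnar <- plogis(-1 - 1.5 * followup)  # depends on the unobserved follow-up

# Complete-case mean of the follow-up under each mechanism
observed_mean <- function(p_miss) mean(followup[runif(n) > p_miss])
cc <- c(mcar = observed_mean(p_mcar),
        mar = observed_mean(p_mar),
        mnar = observed_mean(p_mnar))
cc

# Under MAR, modelling the follow-up given baseline recovers the true mean
obs <- runif(n) > p_mar
fit <- lm(followup ~ baseline, subset = obs)
mar_adjusted <- mean(predict(fit, newdata = data.frame(baseline = baseline)))
mar_adjusted
```

The complete-case mean stays near 0 only under MCAR. Under MAR it is biased, but conditioning on the observed baseline removes the bias. This is the rationale for likelihood-based methods such as MMRM. Under MNAR no adjustment based on observed data alone is guaranteed to work.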

3.3 Correlation Magnitude Guidelines

| Setting | Typical ρ Range |
|---|---|
| Short-term (days to weeks) | 0.7 – 0.9 |
| Medium-term (weeks to months) | 0.5 – 0.7 |
| Long-term (months to years) | 0.3 – 0.5 |
| Highly stable traits | 0.8 – 0.95 |
| Highly variable measures | 0.2 – 0.4 |

4. Regulatory Guidance

FDA

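  • ICH E9 (Statistical Principles): Requires the primary analysis and the handling of missing data to be pre-specified, and asks that the sensitivity of results to the missing-data method be investigated. It does not prescribe a specific model; MMRM is a widely used primary analysis for longitudinal trials.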
  • FDA Guidance on Missing Data in Clinical Trials (2010):
    • Discourages LOCF (Last Observation Carried Forward)
    • Recommends likelihood-based methods (MMRM) or multiple imputation
    • Requires sensitivity analyses under different missing data assumptions
    • Primary estimand should be clearly defined
  • ICH E9(R1) Addendum on Estimands (2019): Requires the estimand to define how intercurrent events (dropout, treatment switching) are handled. The available strategies are treatment policy, hypothetical, composite variable, while on treatment, and principal stratum.

EMA

  • CHMP Guideline on Missing Data (2010): MAR is often a reasonable assumption but must be justified. Sensitivity analyses required for departures from MAR. Pattern mixture and selection models recommended for MNAR scenarios.
  • CHMP Points to Consider on ANCOVA: ANCOVA with baseline as covariate preferred for pre-post designs. More powerful than change score analysis or post-only comparison.

Key Analysis Requirements

  1. Pre-specify correlation structure or state that unstructured will be used
  2. Define primary estimand including handling of missing data
  3. Sensitivity analyses for missing data mechanism assumptions
  4. Avoid LOCF as primary analysis (acceptable only as sensitivity)

Key Citations

  1. ICH E9: Statistical Principles for Clinical Trials (1998)
  2. ICH E9(R1): Addendum on Estimands and Sensitivity Analysis (2019)
  3. FDA Guidance: Missing Data in Confirmatory Clinical Trials (2010)
  4. NRC Report: Prevention and Treatment of Missing Data in Clinical Trials (2010)
  5. CHMP: Guideline on Missing Data in Confirmatory Clinical Trials (2010)

5. Validation Against Industry Standards

ANCOVA (Pre-Post Design)

| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| ANCOVA | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 48/group | 48/group | 48/group ✓ |
| ANCOVA | Δ=5, σ=10, ρ=0.7, α=0.05, power=0.80 | 33/group | 33/group | 33/group ✓ |
| No baseline adj. | Δ=5, σ=10, α=0.05, power=0.80 | 64/group | 64/group | 64/group ✓ |

Repeated Measures (Compound Symmetry)

| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 3 timepoints | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 43/group | 43/group | 43/group ✓ |
| 5 timepoints | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 39/group | 38/group | 39/group ✓ |
| 3 timepoints | Δ=5, σ=10, ρ=0.7, α=0.05, power=0.80 | 51/group | 51/group | 51/group ✓ |

Slope Comparison

| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 4 timepoints | Δβ=2, σ=10, times=0,1,2,3 | 100/group | 100/group | 100/group ✓ |
| 6 timepoints | Δβ=2, σ=10, times=0,1,...,5 | 35/group | 35/group | 35/group ✓ |

Crossover Design

| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 2×2 crossover | Δ=5, σ_w=6, α=0.05, power=0.80 | 12 total | 12 total | 12 total ✓ |
| 2×2 crossover | Δ=5, σ_w=8, α=0.05, power=0.80 | 22 total | 22 total | 22 total ✓ |

Minor variations may occur due to rounding and formula variants.

6. Example SAP Language

Parallel Longitudinal Trial (MMRM)

STATISTICAL ANALYSIS PLAN TEMPLATE
Sample Size Justification

The primary endpoint is change from baseline in [outcome] at Week [X]. Subjects will be assessed at baseline and Weeks [list]. Based on prior studies (Author et al., Year), the standard deviation of [outcome] is [σ] and the correlation between repeated measurements is approximately [ρ] (compound symmetry assumed). We hypothesize a difference of [Δ] units between treatment and control at Week [X]. Using a mixed-model repeated measures (MMRM) analysis with [m] timepoints, α = 0.05 (two-sided), and 80% power, [n] subjects per group are required. Accounting for [X]% dropout by Week [X], we will enroll [N*] subjects per group.

Analysis: The primary analysis will use MMRM with treatment, visit, treatment-by-visit interaction, and baseline value as fixed effects. An unstructured covariance matrix will model within-subject correlation. Kenward-Roger degrees of freedom will be used.

ANCOVA (Pre-Post Design)

STATISTICAL ANALYSIS PLAN TEMPLATE
Sample Size Justification

The primary endpoint is [outcome] at Week [X], adjusted for baseline. Based on prior data, the correlation between baseline and Week [X] values is ρ = [value], and the standard deviation is [σ]. To detect a difference of [Δ] between groups with α = 0.05 and 80% power, ANCOVA requires [n] subjects per group. This represents a [Y]% reduction compared to unadjusted comparison due to the variance reduction from baseline adjustment.

Analysis: ANCOVA with treatment as factor and baseline as covariate.

Slope Comparison (Disease Progression)

STATISTICAL ANALYSIS PLAN TEMPLATE
Sample Size Justification

The primary endpoint is the rate of change (slope) in [outcome] over [duration]. Subjects will be assessed at [timepoints]. Based on natural history studies (Author et al., Year), the control group is expected to decline at [β₀] units per [time unit]. We hypothesize the treatment will reduce the rate of decline to [β₁] units per [time unit], a difference of [Δβ]. Using a linear mixed-effects model with random intercepts and slopes, residual SD of [σ], and [m] equally-spaced measurements, [n] subjects per group are required for 80% power at α = 0.05.

Analysis: Linear mixed model with fixed effects for treatment, time, treatment-by-time interaction, and random subject-specific intercepts and slopes.

Crossover Trial

STATISTICAL ANALYSIS PLAN TEMPLATE
Sample Size Justification

This is a 2×2 crossover trial comparing [treatment A] to [treatment B]. Each subject will receive both treatments in random order with a [duration] washout period. The primary endpoint is [outcome] measured at the end of each treatment period. Based on prior crossover studies (Author et al., Year), the within-subject standard deviation is [σ_w]. To detect a difference of [Δ] between treatments with α = 0.05 and 80% power, [n] subjects are required to complete both periods. Accounting for [X]% dropout, we will enroll [N*] subjects.

Analysis: Mixed model with treatment, period, and sequence as fixed effects and subject nested within sequence as random effect.

7. R Code

ANCOVA (Pre-Post Design)

R
ancova_sample_size <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha/2)
  z_beta <- qnorm(power)

  # Variance reduction from ANCOVA
  var_reduction <- 1 - rho^2

  n <- 2 * sigma^2 * var_reduction * (z_alpha + z_beta)^2 / delta^2

  list(
    n_per_group = ceiling(n),
    variance_reduction = paste0(round((1 - var_reduction) * 100), "%"),
    vs_unadjusted = ceiling(2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2)
  )
}

# Example: Δ=5, σ=10, ρ=0.5
ancova_sample_size(delta = 5, sigma = 10, rho = 0.5)
# n = 48/group (vs. 63 unadjusted with this normal approximation), 25% reduction

# Higher correlation = greater efficiency
ancova_sample_size(delta = 5, sigma = 10, rho = 0.7)
# n = 33/group, 49% reduction

Repeated Measures (Compound Symmetry)

R
repeated_measures_cs <- function(delta, sigma, m, rho, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha/2)
  z_beta <- qnorm(power)

  # Efficiency factor for m measurements under CS
  efficiency <- (1 + (m - 1) * rho) / m

  # Base sample size (single measurement)
  n_base <- 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2

  # Adjusted sample size
  n <- n_base * efficiency

  list(
    n_per_group = ceiling(n),
    efficiency_factor = round(efficiency, 3),
    reduction_vs_single = paste0(round((1 - efficiency) * 100), "%")
  )
}

# Example: 4 timepoints, ρ=0.5
repeated_measures_cs(delta = 5, sigma = 10, m = 4, rho = 0.5)
# n = 40/group, 38% reduction vs single measurement (37.5% before rounding)

# Diminishing returns beyond 4-5 measurements
sapply(2:10, function(m) {
  repeated_measures_cs(5, 10, m, 0.5)$n_per_group
})

Slope Comparison

R
slope_comparison <- function(delta_slope, sigma, times, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha/2)
  z_beta <- qnorm(power)

  # Sum of squared deviations of the time points about their mean
  t_mean <- mean(times)
  V_t <- sum((times - t_mean)^2)

  n <- 2 * sigma^2 * (z_alpha + z_beta)^2 / (delta_slope^2 * V_t)

  list(
    n_per_group = ceiling(n),
    time_variance = V_t,
    n_timepoints = length(times)
  )
}

# Example: 4 equally-spaced times (0,1,2,3), slope difference = 2
slope_comparison(delta_slope = 2, sigma = 10, times = c(0, 1, 2, 3))
# n = 79/group (V_t = 5)

# More timepoints = more power for slope comparison
slope_comparison(delta_slope = 2, sigma = 10, times = 0:5)
# n = 23/group (6 timepoints, V_t = 17.5)

Crossover Design

R
crossover_sample_size <- function(delta, sigma_within, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha/2)
  z_beta <- qnorm(power)

  # 2x2 crossover formula
  n <- 2 * sigma_within^2 * (z_alpha + z_beta)^2 / delta^2

  list(
    n_total = ceiling(n),
    n_per_sequence = ceiling(n / 2)
  )
}

# Example: Δ=5, within-subject SD=6
crossover_sample_size(delta = 5, sigma_within = 6)
# n_total = 23, n_per_sequence = 12

# Compare to parallel design
parallel_n <- 2 * ceiling(2 * 10^2 * (qnorm(0.975) + qnorm(0.8))^2 / 5^2)
crossover_n <- crossover_sample_size(5, 6)$n_total
cat("Parallel:", parallel_n, "vs Crossover:", crossover_n, "\n")

Using longpower Package

R
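# install.packages("longpower")
library(longpower)

# Power for the treatment difference at the final visit (MMRM).
# Per ?power.mmrm: Ra/Rb are correlation matrices, ra/rb are retention
# proportions at each visit, and lambda is the allocation ratio.
m <- 4
Ra <- matrix(0.5, nrow = m, ncol = m)   # Compound symmetry, rho = 0.5
diag(Ra) <- 1
ra <- rep(1, m)                         # Retention at each visit (no dropout)

power.mmrm(
  N = 200,              # Total sample size in current releases; check ?power.mmrm
  Ra = Ra,              # Correlation matrix in group A
  ra = ra,              # Retention in group A
  sigmaa = 10,          # SD in group A
  Rb = Ra,              # Correlation matrix in group B
  rb = ra,              # Retention in group B
  sigmab = 10,          # SD in group B
  lambda = 1,           # Allocation ratio
  delta = 5,            # Difference at the final visit
  sig.level = 0.05
)

# Sample size for a difference in slopes (Liu & Liang approach).
# u and v are lists with one element per group: u holds the covariate for
# the parameter of interest, v the nuisance covariates.
tp <- c(0, 1, 2, 3)
u <- list(u1 = tp, u2 = rep(0, length(tp)))             # Time-by-treatment term
v <- list(v1 = cbind(1, 1, tp), v2 = cbind(1, 0, tp))   # Intercept, group, time
liu.liang.linear.power(
  delta = 5,            # Slope difference to detect
  u = u,
  v = v,
  sigma2 = 100,         # Residual variance
  R = Ra,               # Working correlation matrix
  alternative = "two.sided",
  power = 0.80
)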

Mixed Model Power with simr

R
library(lme4)
library(simr)

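# Create pilot data structure: 30 subjects, 4 visits each
set.seed(123)
pilot_data <- expand.grid(
  subject = 1:30,
  time = c(0, 1, 2, 3)
)
# Assign group by subject so each subject stays in one arm at every visit
pilot_data$group <- ifelse(pilot_data$subject <= 15, "control", "treatment")
# Placeholder outcome; replace with real pilot data before relying on results
pilot_data$y <- rnorm(nrow(pilot_data))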

# Fit model
model <- lmer(y ~ group * time + (1 + time | subject), data = pilot_data)

# Set effect size for group:time interaction
fixef(model)["grouptreatment:time"] <- 2  # Slope difference

# Power simulation
powerSim(model, test = fixed("grouptreatment:time"), nsim = 100)

Dropout Adjustment

R
dropout_adjustment <- function(n, dropout_per_visit, n_visits, method = "conservative") {
  if (method == "conservative") {
    # Assume completers only
    completion_rate <- (1 - dropout_per_visit)^(n_visits - 1)
    n_adj <- ceiling(n / completion_rate)
  } else if (method == "optimistic") {
    # Account for partial information from dropouts
    # Using square root adjustment
    n_adj <- ceiling(n / sqrt((1 - dropout_per_visit)^(n_visits - 1)))
  }

  list(
    original_n = n,
    adjusted_n = n_adj,
    expected_completers = floor(n_adj * (1 - dropout_per_visit)^(n_visits - 1)),
    completion_rate = round((1 - dropout_per_visit)^(n_visits - 1) * 100, 1)
  )
}

# Example: 5% dropout per visit, 5 visits
dropout_adjustment(n = 50, dropout_per_visit = 0.05, n_visits = 5)

AR(1) vs Compound Symmetry Comparison

R
compare_correlation_structures <- function(delta, sigma, m, rho, alpha = 0.05, power = 0.80) {
  z_alpha <- qnorm(1 - alpha/2)
  z_beta <- qnorm(power)

  n_base <- 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2

  # Compound symmetry
  eff_cs <- (1 + (m - 1) * rho) / m
  n_cs <- ceiling(n_base * eff_cs)

  # AR(1): Corr(Y_i, Y_j) = rho^|i - j|, so correlation decays with lag.
  # Variance factor for the time-averaged mean: sum of all entries of the
  # m x m correlation matrix divided by m^2 (other contrasts differ).
  lags <- abs(outer(1:m, 1:m, "-"))
  eff_ar1 <- sum(rho^lags) / m^2
  n_ar1 <- ceiling(n_base * eff_ar1)

  data.frame(
    structure = c("Compound Symmetry", "AR(1)"),
    n_per_group = c(n_cs, n_ar1),
    efficiency = c(eff_cs, eff_ar1)
  )
}

compare_correlation_structures(delta = 5, sigma = 10, m = 5, rho = 0.6)

8. References

Diggle PJ, Heagerty P, Liang KY, Zeger SL (2002). Analysis of Longitudinal Data, 2nd ed. Oxford University Press.

Fitzmaurice GM, Laird NM, Ware JH (2011). Applied Longitudinal Analysis, 2nd ed. Wiley.

Liu GF, Lu K, Mogg R, et al. (2009). Should baseline be a covariate or dependent variable in analyses of change from baseline in clinical trials? Statistics in Medicine, 28(20):2509-2530.

Mallinckrodt CH, et al. (2008). Recommendations for the primary analysis of continuous endpoints in longitudinal clinical trials. Drug Information Journal, 42(4):303-319.

National Research Council (2010). The Prevention and Treatment of Missing Data in Clinical Trials. National Academies Press.

Siddiqui O, Hung HMJ, O'Neill R (2009). MMRM vs. LOCF: a comprehensive comparison based on simulation study and 25 NDA datasets. Journal of Biopharmaceutical Statistics, 19(2):227-246.
