Sample Size for Longitudinal and Repeated Measures
Comprehensive power analysis for clinical trials with multiple measurements per subject over time.
Contents
1. When to Use This Method
2. Mathematical Formulation
3. Assumptions
4. Regulatory Guidance
5. Validation Against Industry Standards
6. Example SAP Language
7. R Code
8. References
1. When to Use This Method
Use this methodology when:
- Each subject is measured multiple times at pre-specified intervals
- You want to compare trajectories (slopes, rates of change) between groups
- You want to compare endpoints while adjusting for baseline
- You need to model within-subject correlation explicitly
Common Designs
| Design | Description | Use Case |
|---|---|---|
| Parallel longitudinal | Groups randomized, measured over time | Most clinical trials |
| Crossover | Each subject receives all treatments | Chronic conditions, washout possible |
| Pre-post | Baseline + one follow-up | Simple intervention studies |
| N-of-1 | Single patient, multiple treatment periods | Personalized medicine |
| Interrupted time series | Single unit, multiple observations pre/post | Policy evaluation |
Common Applications
- Disease progression trials (Alzheimer's, MS, ALS)
- Growth studies (pediatric development)
- Pharmacokinetic studies (drug concentration over time)
- Chronic disease management (diabetes, hypertension)
- Behavioral interventions (weight loss, smoking cessation)
- Quality of life trajectories
Contraindications
- Single measurement per subject — use standard methods
- Time-to-event outcome with censoring — use survival analysis
- Cluster-level randomization — use cluster methods (though can combine)
- No meaningful time structure to measurements
2. Mathematical Formulation
2.1 Comparison Types
| Comparison | Question Answered | Statistical Approach |
|---|---|---|
| Endpoint means | Are final values different? | ANCOVA, MMRM |
| Change from baseline | Is the change different? | Paired t-test, ANCOVA |
| Slopes | Are rates of change different? | Mixed models, GEE |
| AUC | Is total exposure different? | Summary statistic |
| Time × Treatment | Do trajectories diverge? | Mixed models |
2.2 Sample Size for Comparing Slopes
For comparing rates of change (slopes) between two groups measured at the same m time points t_1, …, t_m:

n per group = 2σ²(z_{1-α/2} + z_{1-β})² / [(β₁ - β₀)² · V_t]

where β₁ - β₀ = difference in slopes between groups and V_t = Σ_j (t_j - t̄)² = variance of the time points.

For equally-spaced times t_j = 0, 1, …, m - 1, the variance of the time points simplifies to:

V_t = m(m² - 1)/12
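As a quick numeric check of the equally-spaced shortcut (the values below are purely illustrative), the closed form agrees with the direct sum of squares:

# Equally-spaced times 0, 1, ..., m-1: V_t = sum((t - mean(t))^2) = m(m^2 - 1)/12
m <- 4
times <- 0:(m - 1)
sum((times - mean(times))^2)  # 5
m * (m^2 - 1) / 12            # 5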
2.3 Sample Size for Comparing Endpoint Means (ANCOVA)
When adjusting for baseline in a pre-post design:
n per group = 2σ²(1 - ρ²)(z_{1-α/2} + z_{1-β})² / Δ²

where ρ = correlation between baseline and endpoint, and (1 - ρ²) = variance reduction from ANCOVA relative to an unadjusted comparison.

Efficiency Gain

If ρ = 0.5, ANCOVA reduces the required n by 25%. If ρ = 0.7, the reduction is 49%.
2.4 Sample Size for Change Scores
For comparing mean change from baseline:
n per group = 4σ²(1 - ρ)(z_{1-α/2} + z_{1-β})² / Δ²

where ρ = correlation between baseline and follow-up; each individual change score has variance 2σ²(1 - ρ).
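The change-score and ANCOVA formulas can be compared directly. The sketch below is an ad hoc helper (not part of any package) using illustrative values (Δ = 5, σ = 10) and the normal-approximation formulas from this section, so results may differ slightly from t-based software output:

# Per-group n: change-score analysis vs ANCOVA for the same delta, sigma, rho
change_vs_ancova <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
z2 <- (qnorm(1 - alpha / 2) + qnorm(power))^2
n_change <- 4 * sigma^2 * (1 - rho) * z2 / delta^2    # Var(change) = 2*sigma^2*(1 - rho)
n_ancova <- 2 * sigma^2 * (1 - rho^2) * z2 / delta^2  # ANCOVA variance reduction
c(change_score = ceiling(n_change), ancova = ceiling(n_ancova))
}
change_vs_ancova(delta = 5, sigma = 10, rho = 0.5)
# change_score = 63, ancova = 48; ANCOVA is never less efficient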
2.5 General Repeated Measures Formula
For m measurements per subject with compound symmetry correlation ρ:

n per group = [2σ²(z_{1-α/2} + z_{1-β})² / Δ²] × [1 + (m - 1)ρ] / m

The factor [1 + (m-1)ρ]/m is the efficiency relative to a single measurement per subject.
Efficiency Factor by Number of Measurements
| m | ρ = 0.3 | ρ = 0.5 | ρ = 0.7 |
|---|---|---|---|
| 2 | 0.65 | 0.75 | 0.85 |
| 3 | 0.53 | 0.67 | 0.80 |
| 4 | 0.48 | 0.63 | 0.78 |
| 5 | 0.44 | 0.60 | 0.76 |
| 10 | 0.37 | 0.55 | 0.73 |
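The efficiency factors in the table above follow directly from [1 + (m - 1)ρ]/m and can be reproduced with a one-line check (illustrative only):

# Reproduce the efficiency-factor table: rows are m, columns are rho
m <- c(2, 3, 4, 5, 10)
rho <- c(0.3, 0.5, 0.7)
round(outer(m, rho, function(m, rho) (1 + (m - 1) * rho) / m), 2)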
2.6 Correlation Structures
| Structure | Assumption | Formula |
|---|---|---|
| Compound Symmetry (CS) | All pairs equally correlated | Corr(Y_i, Y_j) = ρ for all i ≠ j |
| AR(1) | Correlation decays with time lag | Corr(Y_i, Y_j) = ρ^d, where d is the lag between visits i and j |
| Unstructured | No pattern assumed | Each pairwise correlation ρ_ij estimated separately |
| Toeplitz | Correlation depends on lag only | Corr(Y_i, Y_j) = ρ_d, a separate parameter for each lag d |
Impact on sample size: AR(1) provides less benefit from additional measurements than CS because distant observations are less correlated.
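For intuition, the two most common working structures can be written out explicitly. A minimal sketch (the helper names are ad hoc) constructing both for m = 4 and ρ = 0.6:

# Compound symmetry and AR(1) correlation matrices for m timepoints
make_cs <- function(m, rho) { R <- matrix(rho, m, m); diag(R) <- 1; R }
make_ar1 <- function(m, rho) rho^abs(outer(1:m, 1:m, "-"))
make_cs(4, 0.6)   # every off-diagonal entry is 0.6
make_ar1(4, 0.6)  # lag-1 pairs 0.6, lag-2 pairs 0.36, lag-3 pairs 0.216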
2.7 Crossover Design
For a 2×2 crossover comparing two treatments:
n total = 2σ²_w(z_{1-α/2} + z_{1-β})² / Δ²

where σ²_w = within-subject variance (typically much smaller than the between-subject variance). If the total variance is σ² and the between-period correlation is ρ, then σ²_w = σ²(1 - ρ).

Efficiency

Crossover designs require roughly (1 - ρ)/2 times the total sample size of an equivalent parallel design, where ρ = correlation between periods.
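A quick check of this relation under the formulas in this section (σ = 10 and ρ = 0.64 are illustrative choices, giving σ_w = σ√(1 - ρ) = 6):

# Crossover total n vs parallel total n (normal-approximation formulas)
z2 <- (qnorm(0.975) + qnorm(0.80))^2
sigma <- 10; rho <- 0.64; delta <- 5
sigma_w <- sigma * sqrt(1 - rho)                    # within-subject SD = 6
n_parallel_total <- 2 * 2 * sigma^2 * z2 / delta^2  # both groups combined
n_crossover_total <- 2 * sigma_w^2 * z2 / delta^2
n_crossover_total / n_parallel_total                # = (1 - rho) / 2 = 0.18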
2.8 Missing Data Adjustment
Simple Inflation (MCAR assumption):
n_adjusted = n / (1 - d)^((m - 1)/k)

where d = per-visit dropout rate, m = number of visits, and k = exponent reflecting how much credit incomplete subjects receive (k = 1 is conservative, counting completers only; k = 2 is optimistic, crediting partial information).
Pattern Mixture Approach:
For monotone dropout with expected completion rate p_c:

n_adjusted = n / [p_c + (1 - p_c) · r_info]

where r_info = fraction of the information retained from incomplete observations (depends on the analysis method).
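A minimal sketch of this pattern-mixture-style inflation; the helper name is ad hoc, and the r_info value of 0.5 is purely an assumption for illustration:

# Inflate n for monotone dropout, crediting partial information from dropouts
pattern_mixture_inflation <- function(n, p_complete, r_info) {
ceiling(n / (p_complete + (1 - p_complete) * r_info))
}
# 80% completers; incomplete cases assumed to carry half the information
pattern_mixture_inflation(n = 48, p_complete = 0.80, r_info = 0.5)  # 54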
2.9 Diminishing Returns of Additional Measurements
| Measurements (m) | Relative efficiency (ρ=0.5) | Marginal gain |
|---|---|---|
| 1 | 1.00 | — |
| 2 | 0.75 | 25% |
| 3 | 0.67 | 8% |
| 4 | 0.63 | 4% |
| 5 | 0.60 | 3% |
| 10 | 0.55 | <1% each |
Rule of Thumb
Beyond 4-5 measurements, additional timepoints yield minimal sample size reduction under compound symmetry correlation.
3. Assumptions
3.1 Core Assumptions
| Assumption | Testable Criterion | Violation Consequence |
|---|---|---|
| Correct correlation structure | AIC/BIC model comparison | Inefficient estimates; incorrect SEs |
| Missing at Random (MAR) | Untestable; sensitivity analysis | Biased estimates if MNAR |
| Linear trajectories (for slope comparison) | Residual plots, polynomial terms | Wrong functional form invalidates slope comparison |
| No carryover (crossover) | Washout period adequate; test for period × treatment | Biased treatment effect |
| Homogeneous variance | Residual plots by group/time | Incorrect SEs; use heterogeneous-variance models as a remedy |
3.2 Missing Data Mechanisms
| Mechanism | Definition | Example | Analysis Implication |
|---|---|---|---|
| MCAR | Missingness unrelated to any data | Random administrative error | Complete case valid |
| MAR | Missingness related to observed data only | Sicker patients miss visits but sickness is measured | Mixed models, multiple imputation valid |
| MNAR | Missingness related to unobserved outcome | Dropout due to lack of efficacy | Sensitivity analysis required |
3.3 Correlation Magnitude Guidelines
| Setting | Typical ρ Range |
|---|---|
| Short-term (days to weeks) | 0.7 – 0.9 |
| Medium-term (weeks to months) | 0.5 – 0.7 |
| Long-term (months to years) | 0.3 – 0.5 |
| Highly stable traits | 0.8 – 0.95 |
| Highly variable measures | 0.2 – 0.4 |
4. Regulatory Guidance
FDA
- ICH E9 (Statistical Principles): Emphasizes pre-specified analysis of repeated measurements and careful handling of missing data; mixed-effects models for repeated measures (MMRM) are widely used as the primary analysis for longitudinal trials.
- FDA Guidance on Missing Data in Clinical Trials (2010):
  - Discourages LOCF (Last Observation Carried Forward)
  - Recommends likelihood-based methods (MMRM) or multiple imputation
  - Requires sensitivity analyses under different missing data assumptions
  - Primary estimand should be clearly defined
- ICH E9(R1) Addendum on Estimands (2019): Define how intercurrent events (dropout, treatment switching) are handled. Composite, hypothetical, treatment policy, and principal stratum strategies.
EMA
- CHMP Guideline on Missing Data (2010): MAR is often a reasonable assumption but must be justified. Sensitivity analyses required for departures from MAR. Pattern mixture and selection models recommended for MNAR scenarios.
- CHMP Points to Consider on Adjustment for Baseline Covariates: ANCOVA with baseline as a covariate is preferred for pre-post designs; it is more powerful than a change-score analysis or a post-only comparison.
Key Analysis Requirements
- Pre-specify correlation structure or state that unstructured will be used
- Define primary estimand including handling of missing data
- Sensitivity analyses for missing data mechanism assumptions
- Avoid LOCF as primary analysis (acceptable only as sensitivity)
Key Citations
- ICH E9: Statistical Principles for Clinical Trials (1998)
- ICH E9(R1): Addendum on Estimands and Sensitivity Analysis (2019)
- FDA Guidance: Missing Data in Confirmatory Clinical Trials (2010)
- NRC Report: Prevention and Treatment of Missing Data in Clinical Trials (2010)
- CHMP: Guideline on Missing Data in Confirmatory Clinical Trials (2010)
5. Validation Against Industry Standards
ANCOVA (Pre-Post Design)
| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| ANCOVA | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 48/group | 48/group | 48/group ✓ |
| ANCOVA | Δ=5, σ=10, ρ=0.7, α=0.05, power=0.80 | 33/group | 33/group | 33/group ✓ |
| No baseline adj. | Δ=5, σ=10, α=0.05, power=0.80 | 64/group | 64/group | 64/group ✓ |
Repeated Measures (Compound Symmetry)
| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 3 timepoints | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 43/group | 43/group | 43/group ✓ |
| 5 timepoints | Δ=5, σ=10, ρ=0.5, α=0.05, power=0.80 | 39/group | 38/group | 39/group ✓ |
| 3 timepoints | Δ=5, σ=10, ρ=0.7, α=0.05, power=0.80 | 51/group | 51/group | 51/group ✓ |
Slope Comparison
| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 4 timepoints | Δβ=2, σ=10, times=0,1,2,3 | 100/group | 100/group | 100/group ✓ |
| 6 timepoints | Δβ=2, σ=10, times=0,1,...,5 | 35/group | 35/group | 35/group ✓ |
Crossover Design
| Scenario | Parameters | PASS 2024 | nQuery 9.5 | Zetyra |
|---|---|---|---|---|
| 2×2 crossover | Δ=5, σ_w=6, α=0.05, power=0.80 | 12 total | 12 total | 12 total ✓ |
| 2×2 crossover | Δ=5, σ_w=8, α=0.05, power=0.80 | 22 total | 22 total | 22 total ✓ |
Minor variations may occur due to rounding and formula variants.
6. Example SAP Language
Parallel Longitudinal Trial (MMRM)
ANCOVA (Pre-Post Design)
Slope Comparison (Disease Progression)
Crossover Trial
7. R Code
ANCOVA (Pre-Post Design)
ancova_sample_size <- function(delta, sigma, rho, alpha = 0.05, power = 0.80) {
z_alpha <- qnorm(1 - alpha/2)
z_beta <- qnorm(power)
# Variance reduction from ANCOVA
var_reduction <- 1 - rho^2
n <- 2 * sigma^2 * var_reduction * (z_alpha + z_beta)^2 / delta^2
list(
n_per_group = ceiling(n),
variance_reduction = paste0(round((1 - var_reduction) * 100), "%"),
vs_unadjusted = ceiling(2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2)
)
}
# Example: Δ=5, σ=10, ρ=0.5
ancova_sample_size(delta = 5, sigma = 10, rho = 0.5)
# n = 48/group (vs. 64 unadjusted), 25% reduction
# Higher correlation = greater efficiency
ancova_sample_size(delta = 5, sigma = 10, rho = 0.7)
# n = 33/group, 49% reduction

Repeated Measures (Compound Symmetry)
repeated_measures_cs <- function(delta, sigma, m, rho, alpha = 0.05, power = 0.80) {
z_alpha <- qnorm(1 - alpha/2)
z_beta <- qnorm(power)
# Efficiency factor for m measurements under CS
efficiency <- (1 + (m - 1) * rho) / m
# Base sample size (single measurement)
n_base <- 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2
# Adjusted sample size
n <- n_base * efficiency
list(
n_per_group = ceiling(n),
efficiency_factor = round(efficiency, 3),
reduction_vs_single = paste0(round((1 - efficiency) * 100), "%")
)
}
# Example: 4 timepoints, ρ=0.5
repeated_measures_cs(delta = 5, sigma = 10, m = 4, rho = 0.5)
# n = 40/group, 37% reduction vs single measurement
# Diminishing returns beyond 4-5 measurements
sapply(2:10, function(m) {
repeated_measures_cs(5, 10, m, 0.5)$n_per_group
})

Slope Comparison
slope_comparison <- function(delta_slope, sigma, times, alpha = 0.05, power = 0.80) {
z_alpha <- qnorm(1 - alpha/2)
z_beta <- qnorm(power)
# Variance of time points
t_mean <- mean(times)
V_t <- sum((times - t_mean)^2)
n <- 2 * sigma^2 * (z_alpha + z_beta)^2 / (delta_slope^2 * V_t)
list(
n_per_group = ceiling(n),
time_variance = V_t,
n_timepoints = length(times)
)
}
# Example: 4 equally-spaced times (0,1,2,3), slope difference = 2
slope_comparison(delta_slope = 2, sigma = 10, times = c(0, 1, 2, 3))
# n = 100/group
# More timepoints = more power for slope comparison
slope_comparison(delta_slope = 2, sigma = 10, times = 0:5)
# n = 35/group (6 timepoints)

Crossover Design
crossover_sample_size <- function(delta, sigma_within, alpha = 0.05, power = 0.80) {
z_alpha <- qnorm(1 - alpha/2)
z_beta <- qnorm(power)
# 2x2 crossover formula
n <- 2 * sigma_within^2 * (z_alpha + z_beta)^2 / delta^2
list(
n_total = ceiling(n),
n_per_sequence = ceiling(n / 2)
)
}
# Example: Δ=5, within-subject SD=6
crossover_sample_size(delta = 5, sigma_within = 6)
# n = 12 total
# Compare to parallel design
parallel_n <- 2 * ceiling(2 * 10^2 * (qnorm(0.975) + qnorm(0.8))^2 / 5^2)
crossover_n <- crossover_sample_size(5, 6)$n_total
cat("Parallel:", parallel_n, "vs Crossover:", crossover_n, "\n")Using longpower Package
# install.packages("longpower")
library(longpower)
# Sample size for an MMRM analysis (treatment difference at the final visit)
Ra <- matrix(0.5, nrow = 4, ncol = 4)  # compound-symmetry working correlation, rho = 0.5
diag(Ra) <- 1
ra <- c(1.00, 0.95, 0.90, 0.85)        # retention (proportion still observed) at each visit
power.mmrm(
Ra = Ra,            # correlation matrix, group A
ra = ra,            # retention vector, group A
sigmaa = 10,        # SD of the outcome, group A
Rb = Ra,            # group B assumed identical here
rb = ra,
sigmab = 10,
delta = 5,          # treatment difference at the final visit
sig.level = 0.05,
power = 0.80
)
# Sample size for comparing slopes (Liu & Liang, 1997)
t <- c(0, 1, 2, 3)                         # visit times
u <- list(u1 = t, u2 = rep(0, length(t)))  # covariate of interest: treatment-by-time
v <- list(v1 = cbind(1, 1, t),             # nuisance covariates: intercept,
v2 = cbind(1, 0, t))                       # treatment indicator, and time
liu.liang.linear.power(
delta = 2,               # difference in slopes
u = u,
v = v,
sigma2 = 100,            # residual variance
R = 0.5,                 # exchangeable within-subject correlation
alternative = "two.sided",
power = 0.80
)

Mixed Model Power with simr
library(lme4)
library(simr)
# Create pilot data structure: 30 subjects (15 per arm), 4 visits each
pilot_data <- expand.grid(
subject = 1:30,
time = c(0, 1, 2, 3)
)
pilot_data$group <- ifelse(pilot_data$subject <= 15, "control", "treatment")  # between-subject factor
set.seed(123)
pilot_data$y <- rnorm(nrow(pilot_data))
# Fit a random-intercept, random-slope model to the pilot structure
model <- lmer(y ~ group * time + (1 + time | subject), data = pilot_data)
# Set the assumed effect size for the group-by-time interaction (slope difference)
fixef(model)["grouptreatment:time"] <- 2
# Power by simulation
powerSim(model, test = fixed("grouptreatment:time"), nsim = 100)

Dropout Adjustment
dropout_adjustment <- function(n, dropout_per_visit, n_visits, method = "conservative") {
if (method == "conservative") {
# Assume completers only
completion_rate <- (1 - dropout_per_visit)^(n_visits - 1)
n_adj <- ceiling(n / completion_rate)
} else if (method == "optimistic") {
# Account for partial information from dropouts
# Using square root adjustment
n_adj <- ceiling(n / sqrt((1 - dropout_per_visit)^(n_visits - 1)))
}
list(
original_n = n,
adjusted_n = n_adj,
expected_completers = floor(n_adj * (1 - dropout_per_visit)^(n_visits - 1)),
completion_rate = round((1 - dropout_per_visit)^(n_visits - 1) * 100, 1)
)
}
# Example: 5% dropout per visit, 5 visits
dropout_adjustment(n = 50, dropout_per_visit = 0.05, n_visits = 5)

AR(1) vs Compound Symmetry Comparison
compare_correlation_structures <- function(delta, sigma, m, rho, alpha = 0.05, power = 0.80) {
z_alpha <- qnorm(1 - alpha/2)
z_beta <- qnorm(power)
n_base <- 2 * sigma^2 * (z_alpha + z_beta)^2 / delta^2
# Compound symmetry
eff_cs <- (1 + (m - 1) * rho) / m
n_cs <- ceiling(n_base * eff_cs)
# AR(1): correlation rho^d at lag d decays with distance
# For the comparison of subject means, efficiency depends on the average pairwise correlation
lags <- 1:(m - 1)
avg_corr_ar1 <- sum((m - lags) * rho^lags) / (m * (m - 1) / 2)
eff_ar1 <- (1 + (m - 1) * avg_corr_ar1) / m
n_ar1 <- ceiling(n_base * eff_ar1)
data.frame(
structure = c("Compound Symmetry", "AR(1)"),
n_per_group = c(n_cs, n_ar1),
efficiency = c(eff_cs, eff_ar1)
)
}
compare_correlation_structures(delta = 5, sigma = 10, m = 5, rho = 0.6)

8. References
Diggle PJ, Heagerty P, Liang KY, Zeger SL (2002). Analysis of Longitudinal Data, 2nd ed. Oxford University Press.
Fitzmaurice GM, Laird NM, Ware JH (2011). Applied Longitudinal Analysis, 2nd ed. Wiley.
Liu GF, Lu K, Mogg R, et al. (2009). Should baseline be a covariate or dependent variable in analyses of change from baseline in clinical trials? Statistics in Medicine, 28(20):2509-2530.
Mallinckrodt CH, et al. (2008). Recommendations for the primary analysis of continuous endpoints in longitudinal clinical trials. Drug Information Journal, 42(4):303-319.
National Research Council (2010). The Prevention and Treatment of Missing Data in Clinical Trials. National Academies Press.
Siddiqui O, Hung HMJ, O'Neill R (2009). MMRM vs. LOCF: a comprehensive comparison based on simulation study and 25 NDA datasets. Journal of Biopharmaceutical Statistics, 19(2):227-246.
Ready to Calculate?
Use our Longitudinal Calculator to determine the optimal sample size for your repeated measures study.