A Complete Guide to CUPED

A comprehensive overview of CUPED (Controlled-experiment Using Pre-experiment Data), a variance reduction technique used to increase the sensitivity and trustworthiness of controlled experiments.

I. Definition and Core Concept

CUPED is a statistical method designed to improve the power of an experiment—the probability of detecting a true treatment effect when it exists. It does this by using data collected before the experiment begins (pre-experiment data) to reduce the variance of the metrics analyzed during the experiment.

In online experimentation, the effects we care about are often small, frequently in the 0.1%–2% range. These signals are easy to miss because they are buried in random variation. The sample size needed to detect a given effect is proportional to the variance of the outcome (the square of its standard deviation), so any reduction in variance translates directly into fewer users or a shorter experiment for the same level of confidence.
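As a rough illustration of that scaling, the sketch below uses the textbook normal-approximation formula for a two-sided, two-sample comparison of means. The function name `required_n_per_arm`, the metric values, and the 50% variance reduction are all hypothetical choices made for this example.

```python
from statistics import NormalDist


def required_n_per_arm(sigma, delta, alpha=0.05, power=0.8):
    """Approximate per-arm sample size for a two-sided, two-sample z-test.

    sigma: standard deviation of the outcome (assumed equal in both arms)
    delta: absolute difference in means we want to be able to detect
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return 2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2


# Hypothetical metric: standard deviation 10, target effect 0.1
baseline_n = required_n_per_arm(sigma=10.0, delta=0.1)

# Hypothetical scenario: the variance (sigma squared) is cut by 50%
variance_reduction = 0.5
reduced_sigma = 10.0 * (1 - variance_reduction) ** 0.5
reduced_n = required_n_per_arm(sigma=reduced_sigma, delta=0.1)

print(f"baseline n per arm: {baseline_n:,.0f}")
print(f"reduced  n per arm: {reduced_n:,.0f}")
print(f"ratio: {reduced_n / baseline_n:.2f}")
```

Under this approximation, the required sample size shrinks by the same fraction as the variance.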

Analogy

Using CUPED is like noise-canceling headphones for your data. Pre-experiment behavior identifies the steady background hum of a user's normal activity. By subtracting that hum, the treatment signal becomes much easier to hear.

II. The Mechanism: Variance Reduction

CUPED works by explaining away part of the variability in the primary outcome metric (Y) using a baseline covariate (X), typically the same metric measured during a period immediately preceding the experiment.

1

Correlation is key

The effectiveness of CUPED depends on the correlation (\rho) between the pre-experiment data and the experiment-period data.

2

Adjusted metric

CUPED constructs an adjusted version of the outcome (\tilde{Y}) by removing the component of Y that can be predicted from baseline behavior.

3

Efficiency gain

The variance of the adjusted metric is approximately (1 - \rho^2) times the original variance.
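The variance factor in step 3 follows from a short calculation. The sketch below assumes a linear adjustment with coefficient b (the symbol this guide uses for the adjustment coefficient) and works with population moments rather than sample estimates.

```latex
\begin{aligned}
\tilde{Y} &= Y - b\,(X - E[X]) \\
\mathrm{Var}(\tilde{Y}) &= \mathrm{Var}(Y) - 2b\,\mathrm{Cov}(X, Y) + b^2\,\mathrm{Var}(X) \\
\text{minimized at } b &= \frac{\mathrm{Cov}(X, Y)}{\mathrm{Var}(X)}: \quad
\mathrm{Var}(\tilde{Y}) = \mathrm{Var}(Y) - \frac{\mathrm{Cov}(X, Y)^2}{\mathrm{Var}(X)}
= \mathrm{Var}(Y)\,(1 - \rho^2)
\end{aligned}
```

The last equality uses \rho^2 = \mathrm{Cov}(X, Y)^2 / (\mathrm{Var}(X)\,\mathrm{Var}(Y)).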

Building Intuition

Correlation (\rho)    Variance Reduction    Interpretation
0.5                   ~25%                  Moderate baseline stability
0.8                   ~64%                  High baseline stability

This reduction directly lowers the standard error, yielding tighter confidence intervals and higher statistical power for the same sample size.
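The table values can be checked with a small simulation. This is a minimal sketch on synthetic data: it assumes X and Y are jointly normal with unit variance, and the function name `simulate_variance_ratio` is hypothetical.

```python
import random
from statistics import mean, variance


def simulate_variance_ratio(rho, n=20000, seed=0):
    """Return Var(adjusted Y) / Var(Y) on synthetic data with correlation rho."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # Build y so that corr(x, y) is rho and Var(y) is 1
        y = rho * x + (1 - rho ** 2) ** 0.5 * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    b = cov / variance(xs)
    adjusted = [y - b * (x - mx) for x, y in zip(xs, ys)]
    return variance(adjusted) / variance(ys)


for rho in (0.5, 0.8):
    ratio = simulate_variance_ratio(rho)
    print(f"rho={rho}: simulated ratio={ratio:.3f}, theory={1 - rho ** 2:.3f}")
```

The simulated ratios should land close to 0.75 and 0.36, matching variance reductions of roughly 25% and 64%.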

III. CUPED vs. Traditional Baseline Adjustments

CUPED is best understood as a large-scale application of ANCOVA (Analysis of Covariance). It improves upon simpler baseline-adjustment approaches such as difference scores.

Difference Scores (Y - X)

This approach subtracts the baseline value from the outcome. Assuming X and Y share the same variance \sigma^2, its variance is:

2\sigma^2(1 - \rho)

Difference scores are only more efficient than a simple post-test mean when the correlation between X and Y exceeds 0.5. Below that threshold, they can actually increase noise.
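The 0.5 threshold can be derived in a few lines. The derivation below assumes, as the variance expression above does, that X and Y have a common variance \sigma^2.

```latex
\begin{aligned}
\mathrm{Var}(Y - X) &= \mathrm{Var}(Y) + \mathrm{Var}(X) - 2\,\mathrm{Cov}(X, Y) \\
&= \sigma^2 + \sigma^2 - 2\rho\,\sigma^2 = 2\sigma^2(1 - \rho) \\
2\sigma^2(1 - \rho) < \sigma^2 &\iff \rho > \tfrac{1}{2}
\end{aligned}
```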

CUPED / ANCOVA

CUPED uses a regression-based adjustment that assigns the optimal weight to the baseline covariate.

Never worse than the unadjusted mean (in large samples)
At least as good as difference scores, and usually better
Optimal among linear adjustments at any correlation level

Key Insight: In practice, CUPED dominates difference scores in almost all realistic experimentation settings.
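To make the comparison concrete, the sketch below tabulates the per-unit variance of each approach relative to Var(Y), using the formulas from this guide. It assumes Var(X) equals Var(Y), and the function name `relative_variances` is hypothetical.

```python
def relative_variances(rho):
    """Variance of each adjusted outcome relative to Var(Y).

    Assumes Var(X) == Var(Y), the same assumption behind the
    difference-score variance 2 * sigma^2 * (1 - rho).
    """
    return {
        "unadjusted": 1.0,
        "difference_score": 2 * (1 - rho),
        "cuped": 1 - rho ** 2,
    }


print(f"{'rho':>5} {'unadjusted':>11} {'difference':>11} {'cuped':>7}")
for rho in (0.0, 0.3, 0.5, 0.8, 0.95):
    v = relative_variances(rho)
    print(f"{rho:>5} {v['unadjusted']:>11.3f} {v['difference_score']:>11.3f} {v['cuped']:>7.3f}")
```

Under these assumptions the CUPED column is never above either of the other two, and the difference-score column only drops below 1 once rho passes 0.5.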

IV. How to Implement CUPED

A practical CUPED strategy typically follows these steps:

1

Define the Overall Evaluation Criterion (OEC)

Identify the primary metric the experiment is designed to move (e.g., revenue per user, sessions per user).

2

Collect pre-experiment covariates

Gather historical data for the same metric at the same unit of analysis (usually users) prior to experiment start.

3

Ensure unit consistency

The randomization unit must be identical in the pre-period and experiment period.

4

Estimate the adjustment coefficient (b)

Compute:

b = \frac{\text{Cov}(X, Y)}{\text{Var}(X)}

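Estimate b on data pooled across all experiment arms (or on a historical pair of consecutive periods), so that the same coefficient is applied to treatment and control.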

5

Compute adjusted outcomes

For each experimental unit:

\tilde{Y} = Y - b(X - E[X])

6

Run the experiment analysis

Perform a standard two-sample t-test (or regression) on the adjusted outcomes.

Because CUPED reduces variance, the resulting test will have a smaller standard error, narrower confidence intervals, and more sensitivity to small effects.

Minimal Pseudocode

# X: pre-period metric
# Y: experiment-period metric
b = cov(X, Y) / var(X)
Y_tilde = Y - b * (X - mean(X))

# Use Y_tilde in your standard A/B test
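The pseudocode above can be expanded into a runnable sketch. Everything here is illustrative: the data is synthetic, the helper names `cuped_adjust` and `welch_test` are hypothetical, and the p-value uses a normal approximation rather than the exact t distribution, which is reasonable only at large sample sizes.

```python
import random
from statistics import NormalDist, mean, variance


def cuped_adjust(x, y):
    """Return CUPED-adjusted outcomes, with b estimated on the data passed in."""
    mx, my = mean(x), mean(y)
    cov = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / (len(x) - 1)
    b = cov / variance(x)
    return [yi - b * (xi - mx) for xi, yi in zip(x, y)]


def welch_test(group_a, group_b):
    """Difference in means, its standard error, and a normal-approximation p-value."""
    diff = mean(group_a) - mean(group_b)
    se = (variance(group_a) / len(group_a) + variance(group_b) / len(group_b)) ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(diff / se)))
    return diff, se, p_value


# Synthetic data (hypothetical): 5,000 users per arm, true lift of 0.05
rng = random.Random(42)
n = 5000
x = [rng.gauss(10, 2) for _ in range(2 * n)]   # pre-period metric
treated = [i < n for i in range(2 * n)]        # first half treatment, second half control
y = [0.8 * xi + rng.gauss(0, 1) + (0.05 if t else 0.0) for xi, t in zip(x, treated)]

# Estimate b on data pooled across both arms, then split by arm
y_tilde = cuped_adjust(x, y)


def split(values):
    """Split a per-user list into (treatment, control) lists."""
    return ([v for v, t in zip(values, treated) if t],
            [v for v, t in zip(values, treated) if not t])


raw = welch_test(*split(y))
adj = welch_test(*split(y_tilde))
print(f"raw:   diff={raw[0]:.3f}  se={raw[1]:.4f}")
print(f"cuped: diff={adj[0]:.3f}  se={adj[1]:.4f}")
```

Because the adjustment subtracts a mean-zero term, the overall mean of the adjusted outcome matches the raw mean. Only the spread changes, which is why the standard error shrinks.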

V. When Not to Use CUPED

CUPED is powerful, but it is not universally helpful. Situations where it may offer little or no benefit include:

Low pre/post correlation

If user behavior is highly volatile or the metric has weak temporal stability, variance reduction will be minimal.

Mismatched units

If the pre-period data cannot be reliably joined to experiment units, the adjustment may introduce bias or noise.

Metrics with structural breaks

If the experiment fundamentally changes how a metric is generated (e.g., redefining the metric itself), pre-period values may no longer be predictive.

Very short pre-periods

Insufficient historical data can lead to unstable coefficient estimates.

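Note: CUPED applies a linear adjustment. Randomization keeps the adjusted estimate unbiased even when the true relationship between baseline behavior and outcomes is not linear, but the variance reduction is limited to what a linear fit can capture. A linear fit is often a good approximation at scale, but it should be validated rather than assumed.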
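Several of these conditions can be checked before committing to CUPED. The sketch below is one possible pre-check, not a prescribed procedure: the function name `cuped_precheck`, the dictionary input shape, and both thresholds are assumptions made for illustration.

```python
from statistics import mean, variance


def cuped_precheck(pre, post, min_rho=0.3, min_match_rate=0.9):
    """Rough diagnostics before applying CUPED.

    pre:  dict of unit_id -> pre-period metric (hypothetical input shape)
    post: dict of unit_id -> experiment-period metric
    The thresholds are illustrative defaults, not values from this guide.
    """
    matched = [uid for uid in post if uid in pre]
    match_rate = len(matched) / len(post)
    xs = [pre[uid] for uid in matched]
    ys = [post[uid] for uid in matched]
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)
    rho = cov / (variance(xs) * variance(ys)) ** 0.5
    return {
        "match_rate": match_rate,               # share of units with pre-period data
        "rho": rho,                             # pre/post correlation on matched units
        "expected_variance_reduction": rho ** 2,
        "worth_using": abs(rho) >= min_rho and match_rate >= min_match_rate,
    }


# Tiny made-up example: u5 has no pre-period record
pre = {"u1": 3.0, "u2": 5.0, "u3": 8.0, "u4": 1.0}
post = {"u1": 4.0, "u2": 5.5, "u3": 9.0, "u4": 2.0, "u5": 7.0}
report = cuped_precheck(pre, post)
print(report)
```

In this toy input the correlation is high, but only four of five units can be joined to pre-period data, so the check flags the match rate rather than the correlation.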

VI. Strategic and Financial Benefits

Organizations that adopt CUPED consistently realize several advantages:

Higher sensitivity

Smaller, previously undetectable effects become measurable.

Shorter experiments

Decisions can be reached faster with the same level of confidence.

Lower opportunity cost

Fewer users remain in control while improvements are delayed.

Cost efficiency

Leveraging existing data is often far cheaper than acquiring additional experimental traffic.

In mature experimentation programs, CUPED is less a nice-to-have optimization and more a baseline expectation—an example of statistical rigor that compounds over dozens of experiments.

VII. Summary and Next Steps

This guide focuses on the why and how of CUPED, but it is intentionally opinionated about scope. A few things you may notice are absent:

No worked numerical example. Step-by-step tables with concrete numbers can make the math feel more tangible, but they tend to lock the discussion into a single metric and time window. This guide prioritizes generalizability over a single illustrative case.
No visualizations. Before/after variance plots are compelling and useful in practice, but they belong closer to tooling and dashboards than to a conceptual reference. Here, the goal is to build correct intuition first.
Concise by design. Shorter content trades off some SEO surface area in exchange for clarity. The intent is to be a canonical explanation readers can return to—not an exhaustive encyclopedia entry.

If you want to move from theory to application, the natural next step is to quantify how much variance reduction you can realistically expect.

Ready to estimate the sample size reduction for your experiment?

Use the CUPED Calculator to model different correlation scenarios and see how variance reduction translates into faster, more sensitive experiments.