
Chi-Square Tests for Categorical Data

A complete guide to chi-square tests of independence and homogeneity, effect size measures, Fisher's exact test, power analysis, sample size determination, and McNemar's test for paired proportions.

1. Overview

The chi-square test is one of the most widely used statistical procedures for analyzing categorical data. It evaluates whether the observed frequencies in a contingency table differ significantly from the frequencies we would expect if the row and column variables were independent.

Two Interpretations

  • Test of independence — a single sample is cross-classified on two categorical variables. Is there an association between them?
  • Test of homogeneity — independent samples from two or more populations are compared on one categorical variable. Do the populations share the same distribution?

Despite the different study designs, the arithmetic is identical: both compute the same Pearson chi-square statistic from a contingency table.

When to Use Chi-Square Tests

Comparing response rates between treatment groups
Testing whether adverse event incidence differs by group
A/B testing conversion rates across variants
Survey data: opinion by demographic category

Tip: Our Chi-Square Calculator supports four modes: Test (2x2 and RxC), Power Analysis, Sample Size, and McNemar's Test — all computed entirely in the browser.

2. The 2x2 Contingency Table

The simplest contingency table is the 2x2 layout, comparing two groups on a binary outcome. Each cell contains the count of observations falling in that row-column combination.

            Outcome +   Outcome −   Row Total
Group 1     a           b           a + b
Group 2     c           d           c + d
Col Total   a + c       b + d       N

Expected Counts

Under the null hypothesis of independence, the expected count for each cell is:

E_{ij} = \frac{(\text{Row } i \text{ total}) \times (\text{Col } j \text{ total})}{N}
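The formula can be sketched in a few lines of stdlib Python (the table values here are hypothetical, chosen only for illustration):

```python
def expected_counts(table):
    """Expected cell counts under independence: E_ij = (row i total)(col j total) / N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

# Hypothetical 2x2 table: a=50, b=10, c=30, d=20 (N = 110)
print(expected_counts([[50, 10], [30, 20]]))
# first cell: 60 * 80 / 110 ≈ 43.64
```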

Degrees of Freedom

For a 2x2 table the degrees of freedom are:

df = (R - 1)(C - 1) = (2 - 1)(2 - 1) = 1

Yates' Continuity Correction

Because the chi-square distribution is continuous but the test statistic is computed from discrete counts, Yates (1934) proposed subtracting 0.5 from each absolute deviation before squaring. For the 2x2 case:

\chi^2_{\text{Yates}} = \sum \frac{(|O_{ij} - E_{ij}| - 0.5)^2}{E_{ij}}

Note: Yates' correction is conservative — it reduces the test statistic, making it harder to reject H₀. Many modern references recommend using the uncorrected statistic or Fisher's exact test instead.

3. Chi-Square Test Statistic

The Pearson chi-square statistic measures the overall discrepancy between observed and expected counts across all cells in an R \times C contingency table:

\chi^2 = \sum_{i=1}^{R} \sum_{j=1}^{C} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}
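A minimal stdlib sketch of the statistic, with Yates' correction from Section 2 as an option (counts are hypothetical):

```python
def chi_square(observed, yates=False):
    """Pearson chi-square statistic; optional Yates continuity correction."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, c in enumerate(col_totals):
            expected = r * c / n
            dev = abs(observed[i][j] - expected)
            if yates:
                dev = max(dev - 0.5, 0.0)  # subtract 0.5 from each absolute deviation
            stat += dev * dev / expected
    return stat

table = [[50, 10], [30, 20]]        # hypothetical counts
print(round(chi_square(table), 4))  # 7.4861
print(chi_square(table, yates=True) < chi_square(table))  # True: correction is conservative
```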

Degrees of Freedom

The general formula for degrees of freedom is:

df = (R - 1)(C - 1)

For example, a 3x4 table has (3 - 1)(4 - 1) = 6 degrees of freedom.

P-value and Decision Rule

The p-value is the probability of observing a chi-square statistic as large or larger than the computed value under the null hypothesis:

p = P(\chi^2_{df} \geq \chi^2_{\text{obs}})

Reject H_0 (independence) when p < \alpha. The chi-square test is always one-tailed (right tail) because larger values indicate greater deviation from independence.
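For 1 degree of freedom the right tail has a closed form via the normal distribution, P(\chi^2_1 \geq x) = \text{erfc}(\sqrt{x/2}), which the Python stdlib can evaluate directly (a sketch, not the calculator's implementation):

```python
import math

def chi2_sf_df1(x):
    """P(chi-square with 1 df >= x) = erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(x / 2.0))

print(round(chi2_sf_df1(3.8415), 3))  # ~0.05 at the df=1 critical value
```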

Key insight: A statistically significant chi-square result tells you the variables are associated but says nothing about the direction or strength of the association. For that you need effect size measures (Section 4).

4. Effect Sizes

Statistical significance depends on sample size: a trivially small association will reach p < 0.05 with a large enough N. Effect size measures quantify how strong the association is, independent of sample size.

Phi Coefficient (2x2 tables)

For 2x2 tables, the phi coefficient is the correlation between two binary variables:

\varphi = \sqrt{\frac{\chi^2}{N}}

Cramér's V (RxC tables)

Cramér's V generalizes phi to tables larger than 2x2 by normalizing by the smaller table dimension:

V = \sqrt{\frac{\chi^2}{N \times \min(R-1,\, C-1)}}
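Both measures are one-liners; for a 2x2 table the normalizer min(R-1, C-1) is 1, so V reduces to |phi|. The numbers below are hypothetical:

```python
import math

def cramers_v(chi2, n, rows, cols):
    """Cramér's V; for a 2x2 table (rows = cols = 2) this equals |phi|."""
    return math.sqrt(chi2 / (n * min(rows - 1, cols - 1)))

# Hypothetical: chi-square = 7.486 on a 2x2 table with N = 110
print(round(cramers_v(7.486, 110, 2, 2), 3))  # 0.261
```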

Cohen's Benchmarks

Effect Size     Small   Medium   Large
w (phi / V)     0.1     0.3      0.5

Odds Ratio (2x2 tables)

The odds ratio quantifies the multiplicative change in odds of the outcome between the two groups:

OR = \frac{a \times d}{b \times c}

Relative Risk (2x2 tables)

The relative risk (risk ratio) compares proportions directly:

RR = \frac{a / (a + b)}{c / (c + d)}

Note: Odds ratio and relative risk are only meaningful for 2x2 tables. For larger tables, use Cramér's V or examine standardized residuals cell by cell.
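Both ratios follow directly from the 2x2 cell labels; a quick sketch with hypothetical counts:

```python
def odds_ratio(a, b, c, d):
    """OR = (a*d) / (b*c); assumes no zero cells (a common fix adds 0.5 to each count)."""
    return (a * d) / (b * c)

def relative_risk(a, b, c, d):
    """RR = [a / (a + b)] / [c / (c + d)]."""
    return (a / (a + b)) / (c / (c + d))

# Hypothetical 2x2 counts: a=50, b=10, c=30, d=20
print(round(odds_ratio(50, 10, 30, 20), 2))     # 3.33
print(round(relative_risk(50, 10, 30, 20), 2))  # 1.39
```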

5. Fisher's Exact Test

The Pearson chi-square relies on a large-sample approximation: the statistic is approximately chi-square distributed. When expected cell counts are small (commonly < 5), this approximation breaks down. Fisher's exact test avoids the approximation entirely.

Hypergeometric Distribution

Given fixed marginal totals, the probability of observing cell count a follows the hypergeometric distribution:

P(a) = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{N}{a+c}}

The exact p-value sums the probabilities of all tables as extreme as or more extreme than the observed table, conditional on the margins.
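This can be sketched with `math.comb` alone. The two-sided p-value below sums every table (with the observed margins) whose probability does not exceed the observed table's — one common convention among several; the calculator's exact convention may differ:

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact test: sum hypergeometric probabilities of every
    table with the observed margins that is no more likely than the observed one."""
    r1, r2, c1 = a + b, c + d, a + c
    n = r1 + r2

    def p_table(x):  # P(top-left cell = x) with all margins fixed
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = p_table(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(p for p in (p_table(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))  # tolerance guards against float ties

# Fisher's classic "lady tasting tea" table
print(round(fisher_exact_2x2(3, 1, 1, 3), 4))  # 0.4857
```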

When to Prefer Fisher's Over Chi-Square

  • Any expected cell count is less than 5
  • Total sample size is less than about 20–30
  • The table is very unbalanced (one margin much larger than the other)
  • You want an exact p-value rather than an asymptotic approximation

Tip: Our calculator automatically reports both Pearson chi-square and Fisher's exact p-values for 2x2 tables. If they disagree meaningfully, trust the Fisher's exact result.

6. Power Analysis

Power is the probability of correctly rejecting H_0 when the alternative is true. For chi-square tests, power depends on the sample size, the significance level, the degrees of freedom, and the effect size.

Cohen's Effect Size w

Cohen defined effect size w for chi-square tests as:

w = \sqrt{\sum_{i=1}^{m} \frac{(p_{0i} - p_{1i})^2}{p_{0i}}}

where p_{0i} are the cell probabilities under H_0 and p_{1i} are the cell probabilities under H_1. For a 2x2 table, w = |\varphi|.
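A direct sketch of the formula, using made-up probabilities (uniform cells under H_0, a mild association under H_1):

```python
import math

def cohens_w(p_null, p_alt):
    """Cohen's w from null and alternative cell probabilities (each sums to 1)."""
    return math.sqrt(sum((p1 - p0) ** 2 / p0 for p0, p1 in zip(p_null, p_alt)))

# Hypothetical 2x2: each cell shifts by 0.05 away from a uniform 0.25
print(round(cohens_w([0.25] * 4, [0.3, 0.2, 0.2, 0.3]), 4))  # 0.2
```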

Non-centrality Parameter

Under the alternative hypothesis, the test statistic follows a non-central chi-square distribution with non-centrality parameter:

\lambda = N \times w^2

Power Calculation

Power equals the probability that a non-central chi-square variate exceeds the critical value from the central distribution:

\text{Power} = P\left(\chi^2_{df,\lambda} > \chi^2_{df,\alpha}\right)
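For df = 1 this has an exact stdlib form: a noncentral chi-square with 1 df and noncentrality \lambda is the distribution of (Z + \sqrt{\lambda})^2, so power reduces to normal CDF evaluations. A sketch with the \alpha = 0.05 critical value hard-coded (a general version would invert the chi-square CDF):

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_chi2_df1(w, n, crit=3.8414588):
    """Power of a 1-df chi-square test (crit = critical value for alpha = 0.05).
    Uses: noncentral chi2(1, lambda) is the distribution of (Z + sqrt(lambda))^2."""
    delta = math.sqrt(n) * w  # sqrt of the noncentrality lambda = N * w^2
    root = math.sqrt(crit)
    return norm_cdf(-root - delta) + 1.0 - norm_cdf(root - delta)

print(round(power_chi2_df1(0.3, 88), 3))  # ~0.80 for a medium effect with N = 88
```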

Sample Size Determination

To find the minimum N for a desired power 1 - \beta, invert the power equation. A useful approximation:

N \approx \frac{\chi^2_{df,\alpha} + z_{1-\beta}\sqrt{2 \cdot df}}{w^2}

Note: This approximation works well for df \geq 2. The calculator uses a normal approximation with Newton refinement for the non-centrality parameter.
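For df = 1 the approximation can be bypassed entirely: compute exact power via the identity that a noncentral chi-square with 1 df is (Z + \sqrt{\lambda})^2, then scan for the smallest N. A stdlib sketch, again hard-coding the \alpha = 0.05 critical value:

```python
import math

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def power_df1(w, n, crit=3.8414588):  # crit: chi-square critical value, df=1, alpha=0.05
    delta = math.sqrt(n) * w
    root = math.sqrt(crit)
    return norm_cdf(-root - delta) + 1.0 - norm_cdf(root - delta)

def min_n_df1(w, target_power=0.80):
    """Smallest total N whose exact df=1 power reaches the target (linear scan)."""
    n = 2
    while power_df1(w, n) < target_power:
        n += 1
    return n

print(min_n_df1(0.3))  # 88 — matches the standard tables for w = 0.3, df = 1
```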

7. McNemar's Test

McNemar's test is designed for paired binary data — before/after measurements on the same subjects, matched case-control studies, or any design where each observation in one condition is paired with an observation in the other.

The Discordant Pairs

Consider paired binary outcomes arranged in a 2x2 table:

            After +          After −
Before +    a (concordant)   b (discordant)
Before −    c (discordant)   d (concordant)

Only the discordant pairs (b and c) carry information about change. The test statistic is:

\chi^2_{\text{McNemar}} = \frac{(b - c)^2}{b + c}

This follows a chi-square distribution with 1 degree of freedom under H_0: p_b = p_c (i.e., the probability of change in each direction is equal).
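Since the statistic has 1 df, both it and its p-value fit in a few stdlib lines (the discordant counts are hypothetical):

```python
import math

def mcnemar_test(b, c):
    """McNemar statistic from the discordant counts, plus its 1-df p-value."""
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2.0))  # right tail of chi-square, 1 df
    return stat, p_value

# Hypothetical: 15 pairs changed one way, 5 the other
stat, p = mcnemar_test(15, 5)
print(stat, round(p, 4))  # 5.0 0.0253
```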

When to Use McNemar vs Standard Chi-Square

  • McNemar: paired or matched data — the same subjects measured at two time points, or case-control pairs
  • Standard chi-square: independent groups — different subjects in each cell of the contingency table

Warning: Applying a standard chi-square test to paired data ignores the within-pair correlation and can produce misleading results. Always check whether your data are paired before choosing a test.

8. Assumptions & Limitations

Key Assumptions

  • Independence: each observation is independent of every other. Clustered or repeated-measures data violate this assumption.
  • Expected cell counts: the chi-square approximation requires all expected counts to be reasonably large. The common rule of thumb is E_{ij} \geq 5 for all cells.
  • Fixed margins: the total sample size (and sometimes row/column totals) is fixed by design.
  • Mutually exclusive categories: each observation falls into exactly one cell.

Alternatives When Assumptions Fail

Situation                       Alternative Method
Small expected counts (< 5)     Fisher's exact test
Paired or matched data          McNemar's test
Prefer likelihood-based test    G-test (log-likelihood ratio)
Stratified / confounded data    Cochran-Mantel-Haenszel test
Ordered categories              Cochran-Armitage trend test

Rule of thumb: If more than 20% of expected cell counts fall below 5, or any expected count is below 1, do not rely on the chi-square approximation — use Fisher's exact test or collapse categories.
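The rule of thumb above is easy to automate as a pre-flight check; a sketch (the function name and tables are illustrative):

```python
def chi_square_approx_ok(table):
    """Rule of thumb: no expected count below 1, at most 20% of cells below 5."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [r * c / n for r in row_totals for c in col_totals]
    if min(expected) < 1:
        return False
    return sum(e < 5 for e in expected) <= 0.2 * len(expected)

print(chi_square_approx_ok([[50, 10], [30, 20]]))  # True
print(chi_square_approx_ok([[2, 1], [1, 2]]))      # False — every expected count is 1.5
```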

9. API Reference

The Chi-Square Calculator runs entirely in the browser — there is no backend API. State is captured in URL parameters so results can be shared via links. Below are the four modes and their parameters.

Mode 1: Test (2x2 and RxC)

Parameter   Type     Description
mode        string   "test"
rows        number   Number of rows (2–10)
cols        number   Number of columns (2–10)
data        string   Comma-separated cell values (row-major order)
alpha       number   Significance level (default 0.05)

Outputs: chi-square statistic, df, p-value, phi (2x2), Cramér's V, odds ratio (2x2), relative risk (2x2), Fisher's exact p-value (2x2), expected counts.

Example URL: /calculators/chi-square?mode=test&rows=2&cols=2&data=50,10,30,20&alpha=0.05
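Since all state lives in the URL, a shareable link can be assembled with any query-string helper; for instance in Python (passing `safe=","` so the comma-separated data survives unescaped, as in the example URL):

```python
from urllib.parse import urlencode

params = {
    "mode": "test",
    "rows": 2,
    "cols": 2,
    "data": "50,10,30,20",  # cell counts in row-major order
    "alpha": 0.05,
}
url = "/calculators/chi-square?" + urlencode(params, safe=",")
print(url)
# /calculators/chi-square?mode=test&rows=2&cols=2&data=50,10,30,20&alpha=0.05
```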

Mode 2: Power Analysis

Parameter   Type     Description
mode        string   "power"
w           number   Effect size w
alpha       number   Significance level (default 0.05)
power       number   Target power (default 0.80)
df          number   Degrees of freedom (default 1)

Outputs: required total sample size N, non-centrality parameter, critical value.

Example URL: /calculators/chi-square/power?w=0.3&alpha=0.05&power=0.8&df=1

Mode 3: Sample Size

Parameter   Type     Description
ssrows      number   Number of rows (default 2)
sscols      number   Number of columns (default 2)
expected    string   Comma-separated expected proportions (row-major)
alpha       number   Significance level (default 0.05)
power       number   Desired power (default 0.80)

Outputs: required total sample size, per-cell N, derived Cohen's w, chi-square critical value.

Example URL: /calculators/chi-square/sample-size?ssrows=2&sscols=2&expected=0.3,0.2,0.2,0.3&alpha=0.05&power=0.8

Mode 4: McNemar's Test

Parameter   Type     Description
mode        string   "mcnemar"
a           number   Count: +/+ (concordant)
b           number   Count: +/− (discordant)
c           number   Count: −/+ (discordant)
d           number   Count: −/− (concordant)

Outputs: McNemar chi-square statistic, p-value, discordant pair count, exact binomial p-value.

Example URL: /calculators/chi-square?mode=mcnemar&a=40&b=15&c=5&d=30

10. References

  1. Pearson K. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine. 1900;50(302):157–175.
  2. Fisher RA. On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society. 1922;85(1):87–94.
  3. Yates F. Contingency tables involving small numbers and the chi-square test. Supplement to the Journal of the Royal Statistical Society. 1934;1(2):217–235.
  4. Cramér H. Mathematical Methods of Statistics. Princeton University Press; 1946.
  5. McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika. 1947;12(2):153–157.
  6. Cochran WG. Some methods for strengthening the common chi-square tests. Biometrics. 1954;10(4):417–451.
  7. Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.

Last updated: April 2026

Ready to run your chi-square analysis?

Use our Chi-Square Calculator for contingency table tests, power analysis, sample size determination, and McNemar's test — all computed in the browser.

Open Chi-Square Calculator