Chi-Square Tests for Categorical Data
A complete guide to chi-square tests of independence and homogeneity, effect size measures, Fisher's exact test, power analysis, sample size determination, and McNemar's test for paired proportions.
1. Overview
The chi-square test is one of the most widely used statistical procedures for analyzing categorical data. It evaluates whether the observed frequencies in a contingency table differ significantly from the frequencies we would expect if the row and column variables were independent.
Two Interpretations
- Test of independence — a single sample is cross-classified on two categorical variables. Is there an association between them?
- Test of homogeneity — independent samples from two or more populations are compared on one categorical variable. Do the populations share the same distribution?
Despite the different study designs, the arithmetic is identical: both compute the same Pearson chi-square statistic from a contingency table.
When to Use Chi-Square Tests
Use a chi-square test when independent observations are counted into mutually exclusive categories and you want to know whether two categorical variables are associated, or whether several populations share the same categorical distribution.
Tip: Our Chi-Square Calculator supports four modes: Test (2x2 and RxC), Power Analysis, Sample Size, and McNemar's Test — all computed entirely in the browser.
2. The 2x2 Contingency Table
The simplest contingency table is the 2x2 layout, comparing two groups on a binary outcome. Each cell contains the count of observations falling in that row-column combination.
| | Outcome + | Outcome − | Row Total |
|---|---|---|---|
| Group 1 | a | b | a + b |
| Group 2 | c | d | c + d |
| Col Total | a + c | b + d | N |
Expected Counts
Under the null hypothesis of independence, the expected count for each cell is:

$$E_{ij} = \frac{(\text{row } i \text{ total}) \times (\text{column } j \text{ total})}{N}$$
Degrees of Freedom
For a 2x2 table the degrees of freedom are:

$$df = (2 - 1)(2 - 1) = 1$$
Yates' Continuity Correction
Because the chi-square distribution is continuous but the test statistic is computed from discrete counts, Yates (1934) proposed subtracting 0.5 from each absolute deviation before squaring. For the 2x2 case:

$$\chi^2_{\text{Yates}} = \sum_{i,j} \frac{\left(|O_{ij} - E_{ij}| - 0.5\right)^2}{E_{ij}}$$
Note: Yates' correction is conservative — it reduces the test statistic, making it harder to reject H₀. Many modern references recommend using the uncorrected statistic or Fisher's exact test instead.
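As a quick check of the arithmetic, the 2x2 statistic with and without Yates' correction can be computed in a few lines of standard-library Python. `chi2_2x2` is an illustrative helper, not the calculator's implementation:

```python
import math

def chi2_2x2(a, b, c, d, yates=False):
    """Pearson chi-square for the 2x2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    observed = [a, b, c, d]
    # Expected count for each cell: row total * column total / N
    expected = [(a + b) * (a + c) / n, (a + b) * (b + d) / n,
                (c + d) * (a + c) / n, (c + d) * (b + d) / n]
    stat = 0.0
    for o, e in zip(observed, expected):
        dev = abs(o - e)
        if yates:
            dev = max(dev - 0.5, 0.0)   # Yates' continuity correction
        stat += dev * dev / e
    # df = 1 for a 2x2 table; the chi-square survival function with
    # df = 1 reduces to erfc(sqrt(x / 2))
    p = math.erfc(math.sqrt(stat / 2))
    return stat, p

stat, p = chi2_2x2(50, 10, 30, 20)                   # uncorrected: approx 7.49
stat_y, p_y = chi2_2x2(50, 10, 30, 20, yates=True)   # corrected statistic is smaller
```

Running both versions on the same table shows the conservatism directly: the Yates statistic is always less than or equal to the uncorrected one.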
3. Chi-Square Test Statistic
The Pearson chi-square statistic measures the overall discrepancy between observed and expected counts across all cells in an r × c contingency table:

$$\chi^2 = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$$
Degrees of Freedom
The general formula for degrees of freedom is:

$$df = (r - 1)(c - 1)$$

For example, a 3x4 table has (3 − 1)(4 − 1) = 6 degrees of freedom.
P-value and Decision Rule
The p-value is the probability of observing a chi-square statistic as large or larger than the computed value under the null hypothesis:

$$p = P\left(\chi^2_{df} \geq \chi^2_{\text{obs}}\right)$$

Reject H₀ (independence) when p ≤ α. The chi-square test is always one-tailed (right tail) because larger values indicate greater deviation from independence.
Key insight: A statistically significant chi-square result tells you the variables are associated but says nothing about the direction or strength of the association. For that you need effect size measures (Section 4).
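To make the general formula concrete, here is a sketch that computes the Pearson statistic and df for an arbitrary RxC table of counts. The p-value helper uses the closed-form chi-square tail that exists when df is even (as for a 3x4 table, df = 6); it is illustrative, not the calculator's code:

```python
import math

def pearson_chi2(table):
    """Pearson chi-square statistic and df for an r x c table of counts."""
    r, c = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(c)]
    stat = 0.0
    for i in range(r):
        for j in range(c):
            e = row_tot[i] * col_tot[j] / n   # expected count under independence
            stat += (table[i][j] - e) ** 2 / e
    return stat, (r - 1) * (c - 1)

def chi2_sf_even_df(x, df):
    """Upper-tail chi-square probability; closed form valid only for even df."""
    k = df // 2
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= (x / 2) / i
        total += term
    return math.exp(-x / 2) * total

table = [[20, 15, 10, 5], [10, 15, 15, 10], [5, 10, 15, 20]]
stat, df = pearson_chi2(table)       # df = (3 - 1)(4 - 1) = 6
p = chi2_sf_even_df(stat, df)
```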
4. Effect Sizes
Statistical significance depends on sample size: a trivially small association will reach p < α with a large enough N. Effect size measures quantify how strong the association is, independent of sample size.
Phi Coefficient (2x2 tables)
For 2x2 tables, the phi coefficient is the correlation between two binary variables:

$$\phi = \frac{ad - bc}{\sqrt{(a+b)(c+d)(a+c)(b+d)}}$$

Equivalently, |φ| = √(χ²/N).
Cramér's V (RxC tables)
Cramér's V generalizes phi to tables larger than 2x2 by normalizing by the smaller table dimension:

$$V = \sqrt{\frac{\chi^2}{N\left(\min(r, c) - 1\right)}}$$
Cohen's Benchmarks
| Effect Size | Small | Medium | Large |
|---|---|---|---|
| w (phi / V) | 0.1 | 0.3 | 0.5 |
Odds Ratio (2x2 tables)
The odds ratio quantifies the multiplicative change in odds of the outcome between the two groups:

$$OR = \frac{a/b}{c/d} = \frac{ad}{bc}$$
Relative Risk (2x2 tables)
The relative risk (risk ratio) compares proportions directly:

$$RR = \frac{a/(a+b)}{c/(c+d)}$$
Note: Odds ratio and relative risk are only meaningful for 2x2 tables. For larger tables, use Cramér's V or examine standardized residuals cell by cell.
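For a worked 2x2 example (a = 50, b = 10, c = 30, d = 20), all four effect sizes are one-liners in plain Python, shown here just to exercise the formulas:

```python
import math

# 2x2 cell counts: rows are groups, columns are outcome +/-
a, b, c, d = 50, 10, 30, 20
n = a + b + c + d

phi = (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))
chi2 = n * phi ** 2                     # since |phi| = sqrt(chi2 / N)
cramers_v = math.sqrt(chi2 / (n * 1))   # min(r, c) - 1 = 1 for a 2x2 table
odds_ratio = (a * d) / (b * c)          # ad / bc
rel_risk = (a / (a + b)) / (c / (c + d))  # risk in group 1 / risk in group 2
```

Note that for a 2x2 table Cramér's V collapses to |φ|, as the code makes explicit.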
5. Fisher's Exact Test
The Pearson chi-square relies on a large-sample approximation: the statistic is approximately chi-square distributed. When expected cell counts are small (commonly < 5), this approximation breaks down. Fisher's exact test avoids the approximation entirely.
Hypergeometric Distribution
Given fixed marginal totals, the probability of observing cell count a follows the hypergeometric distribution:

$$P(a) = \frac{\binom{a+b}{a}\binom{c+d}{c}}{\binom{N}{a+c}}$$
The exact p-value sums the probabilities of all tables as extreme as or more extreme than the observed table, conditional on the margins.
When to Prefer Fisher's Over Chi-Square
- Any expected cell count is less than 5
- Total sample size is less than about 20–30
- The table is very unbalanced (one margin much larger than the other)
- You want an exact p-value rather than an asymptotic approximation
Tip: Our calculator automatically reports both Pearson chi-square and Fisher's exact p-values for 2x2 tables. If they disagree meaningfully, trust the Fisher's exact result.
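Because the hypergeometric probabilities involve only binomial coefficients, an exact-test sketch needs nothing beyond the standard library. `fisher_exact_2x2` is a hypothetical helper implementing the common two-sided rule of summing every table no more probable than the observed one:

```python
import math

def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d          # row totals (fixed margins)
    c1, n = a + c, a + b + c + d   # first column total, grand total

    def hyper(x):
        # P(top-left cell = x | margins), hypergeometric
        return math.comb(r1, x) * math.comb(r2, c1 - x) / math.comb(n, c1)

    p_obs = hyper(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)   # feasible values of the cell
    # Sum every table whose probability does not exceed the observed one
    return sum(hyper(x) for x in range(lo, hi + 1)
               if hyper(x) <= p_obs * (1 + 1e-9))

# Fisher's classic tea-tasting table: two-sided p = 34/70
p = fisher_exact_2x2(3, 1, 1, 3)
```

The small tolerance on the comparison guards against floating-point ties when two tables have mathematically equal probability.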
6. Power Analysis
Power is the probability of correctly rejecting H₀ when the alternative is true. For chi-square tests, power depends on the sample size, the significance level, the degrees of freedom, and the effect size.
Cohen's Effect Size w
Cohen defined effect size for chi-square tests as:

$$w = \sqrt{\sum_{i} \frac{(p_{1i} - p_{0i})^2}{p_{0i}}}$$

where p₀ᵢ are the cell probabilities under H₀ and p₁ᵢ are the cell probabilities under H₁. For a 2x2 table, w = |φ|.
Non-centrality Parameter
Under the alternative hypothesis, the test statistic follows a non-central chi-square distribution with non-centrality parameter:

$$\lambda = N w^2$$
Power Calculation
Power equals the probability that a non-central chi-square variate exceeds the critical value from the central distribution:

$$\text{Power} = P\left(\chi^2_{df}(\lambda) > \chi^2_{df,\,1-\alpha}\right)$$
Sample Size Determination
To find the minimum N for a desired power 1 − β, invert the power equation. A useful approximation for the df = 1 case:

$$N \approx \left(\frac{z_{1-\alpha/2} + z_{1-\beta}}{w}\right)^2$$

Note: This approximation applies to df = 1. The calculator uses a normal approximation with Newton refinement for the non-centrality parameter.
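For df = 1 the non-central chi-square is just a squared shifted normal, χ²₁(λ) = (Z + √λ)², which makes both power and the closed-form N computable with the standard library alone. A sketch of the math, not the calculator's code:

```python
import math
from statistics import NormalDist

nd = NormalDist()

def chi2_power_df1(w, n, alpha=0.05):
    """Power of a df = 1 chi-square test via chi2_1(lambda) = (Z + sqrt(lambda))^2."""
    z_crit = nd.inv_cdf(1 - alpha / 2)   # sqrt of the df = 1 critical value
    delta = w * math.sqrt(n)             # sqrt of the non-centrality lambda = N w^2
    return nd.cdf(delta - z_crit) + nd.cdf(-delta - z_crit)

def sample_size_df1(w, alpha=0.05, power=0.80):
    """Invert the power equation for df = 1: N = ((z_crit + z_power) / w)^2."""
    z = nd.inv_cdf(1 - alpha / 2) + nd.inv_cdf(power)
    return math.ceil((z / w) ** 2)

n_needed = sample_size_df1(0.3)            # medium effect, alpha = .05, power = .80
achieved = chi2_power_df1(0.3, n_needed)   # power actually attained at that N
```

The closed-form N drops the (negligible) second tail term, so the achieved power at the returned N lands slightly above the target.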
7. McNemar's Test
McNemar's test is designed for paired binary data — before/after measurements on the same subjects, matched case-control studies, or any design where each observation in one condition is paired with an observation in the other.
The Discordant Pairs
Consider paired binary outcomes arranged in a 2x2 table:
| | After + | After − |
|---|---|---|
| Before + | a (concordant) | b (discordant) |
| Before − | c (discordant) | d (concordant) |
Only the discordant pairs (b and c) carry information about change. The test statistic is:

$$\chi^2 = \frac{(b - c)^2}{b + c}$$

This follows a chi-square distribution with 1 degree of freedom under H₀ (i.e., the probability of change in each direction is equal).
When to Use McNemar vs Standard Chi-Square
- McNemar: paired or matched data — the same subjects measured at two time points, or case-control pairs
- Standard chi-square: independent groups — different subjects in each cell of the contingency table
Warning: Applying a standard chi-square test to paired data ignores the within-pair correlation and can produce misleading results. Always check whether your data are paired before choosing a test.
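The McNemar statistic and an exact binomial check both fit in a few lines of standard-library Python. These helpers are illustrative; the continuity-corrected variant is an optional extra beyond the uncorrected formula given above:

```python
import math

def mcnemar(b, c, correction=False):
    """McNemar chi-square from the discordant counts b and c (df = 1)."""
    dev = abs(b - c) - (1 if correction else 0)   # optional continuity correction
    stat = max(dev, 0) ** 2 / (b + c)
    p = math.erfc(math.sqrt(stat / 2))            # chi-square tail for df = 1
    return stat, p

def mcnemar_exact(b, c):
    """Exact two-sided binomial p-value: X ~ Binomial(b + c, 1/2) under H0."""
    n, k = b + c, min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Discordant counts from the Mode 4 example URL (b = 15, c = 5)
stat, p = mcnemar(15, 5)        # (b - c)^2 / (b + c) = 100 / 20 = 5.0
p_exact = mcnemar_exact(15, 5)
```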
8. Assumptions & Limitations
Key Assumptions
- Independence: each observation is independent of every other. Clustered or repeated-measures data violate this assumption.
- Expected cell counts: the chi-square approximation requires all expected counts to be reasonably large. The common rule of thumb is an expected count of at least 5 in every cell.
- Fixed margins: the total sample size (and sometimes row/column totals) is fixed by design.
- Mutually exclusive categories: each observation falls into exactly one cell.
Alternatives When Assumptions Fail
| Situation | Alternative Method |
|---|---|
| Small expected counts (< 5) | Fisher's exact test |
| Paired or matched data | McNemar's test |
| Prefer likelihood-based test | G-test (log-likelihood ratio) |
| Stratified / confounded data | Cochran-Mantel-Haenszel test |
| Ordered categories | Cochran-Armitage trend test |
Rule of thumb: If more than 20% of expected cell counts fall below 5, or any expected count is below 1, do not rely on the chi-square approximation — use Fisher's exact test or collapse categories.
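The 20% / minimum-of-1 rule of thumb is easy to automate. A hypothetical helper that flags tables where the chi-square approximation should not be trusted:

```python
def expected_counts_ok(table):
    """True if no expected count is below 1 and at most 20% are below 5."""
    r, c = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(table[i][j] for i in range(r)) for j in range(c)]
    expected = [row_tot[i] * col_tot[j] / n for i in range(r) for j in range(c)]
    below_5 = sum(e < 5 for e in expected)
    return min(expected) >= 1 and below_5 <= 0.2 * len(expected)

ok = expected_counts_ok([[50, 10], [30, 20]])   # True: smallest E is about 13.6
bad = expected_counts_ok([[2, 1], [1, 2]])      # False: every E is 1.5
```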
9. API Reference
The Chi-Square Calculator runs entirely in the browser — there is no backend API. State is captured in URL parameters so results can be shared via links. Below are the four modes and their parameters.
Mode 1: Test (2x2 and RxC)
| Parameter | Type | Description |
|---|---|---|
| mode | string | "test" |
| rows | number | Number of rows (2–10) |
| cols | number | Number of columns (2–10) |
| data | string | Comma-separated cell values (row-major order) |
| alpha | number | Significance level (default 0.05) |
Outputs: chi-square statistic, df, p-value, phi (2x2), Cramér's V, odds ratio (2x2), relative risk (2x2), Fisher's exact p-value (2x2), expected counts.
Example URL: /calculators/chi-square?mode=test&rows=2&cols=2&data=50,10,30,20&alpha=0.05
Mode 2: Power Analysis
| Parameter | Type | Description |
|---|---|---|
| mode | string | "power" |
| w | number | Effect size w |
| alpha | number | Significance level (default 0.05) |
| power | number | Target power (default 0.80) |
| df | number | Degrees of freedom (default 1) |
Outputs: required total sample size N, non-centrality parameter, critical value.
Example URL: /calculators/chi-square/power?w=0.3&alpha=0.05&power=0.8&df=1
Mode 3: Sample Size
| Parameter | Type | Description |
|---|---|---|
| ssrows | number | Number of rows (default 2) |
| sscols | number | Number of columns (default 2) |
| expected | string | Comma-separated expected proportions (row-major) |
| alpha | number | Significance level (default 0.05) |
| power | number | Desired power (default 0.80) |
Outputs: required total sample size, per-cell N, derived Cohen's w, chi-square critical value.
Example URL: /calculators/chi-square/sample-size?ssrows=2&sscols=2&expected=0.3,0.2,0.2,0.3&alpha=0.05&power=0.8
Mode 4: McNemar's Test
| Parameter | Type | Description |
|---|---|---|
| mode | string | "mcnemar" |
| a | number | Count: +/+ (concordant) |
| b | number | Count: +/− (discordant) |
| c | number | Count: −/+ (discordant) |
| d | number | Count: −/− (concordant) |
Outputs: McNemar chi-square statistic, p-value, discordant pair count, exact binomial p-value.
Example URL: /calculators/chi-square?mode=mcnemar&a=40&b=15&c=5&d=30
10. References
- Pearson K. On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling. Philosophical Magazine. 1900;50(302):157–175.
- Fisher RA. On the interpretation of chi-square from contingency tables, and the calculation of P. Journal of the Royal Statistical Society. 1922;85(1):87–94.
- Yates F. Contingency tables involving small numbers and the chi-square test. Supplement to the Journal of the Royal Statistical Society. 1934;1(2):217–235.
- Cramér H. Mathematical Methods of Statistics. Princeton University Press; 1946.
- McNemar Q. Note on the sampling error of the difference between correlated proportions or percentages. Psychometrika. 1947;12(2):153–157.
- Cochran WG. Some methods for strengthening the common chi-square tests. Biometrics. 1954;10(4):417–451.
- Cohen J. Statistical Power Analysis for the Behavioral Sciences. 2nd ed. Lawrence Erlbaum Associates; 1988.
Last updated: April 2026
Ready to run your chi-square analysis?
Use our Chi-Square Calculator for contingency table tests, power analysis, sample size determination, and McNemar's test — all computed in the browser.
Open Chi-Square Calculator