Simpson's Paradox

What's Happening?

The Paradox

Treatment A has a higher success rate for BOTH small kidney stones AND large kidney stones. Yet when the data are combined, Treatment B appears to have the higher overall success rate!

Simpson's Paradox occurs when a trend that appears in each of several groups of data reverses or disappears when the groups are combined. It is named after statistician Edward H. Simpson, who described it in 1951, and it shows why an association seen in aggregated data cannot be read as a causal effect without checking how the groups were formed.

The Hidden Culprit: Confounding Variables

Treatment Choice ← Case Severity (Confounding Variable) → Treatment Outcome

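Doctors tend to choose Treatment A for harder cases, so A's overall record is dominated by cases that are less likely to succeed under either treatment. This creates the illusion that A is worse.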

The key insight: when a confounding variable influences both which group a case falls into AND the outcome, simple aggregation gives misleading results. Each overall rate is a weighted average of the subgroup rates, and the confounding variable gives the groups being compared different weights.
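The weighting effect can be written out explicitly. The following is a sketch in notation that does not appear in the original: r_{T,g} is the success rate of treatment T within subgroup g, n_{T,g} is the number of cases treated with T in subgroup g, and w_{T,g} is the share of T's cases that fall into subgroup g.

```latex
R_T = \sum_{g} w_{T,g}\, r_{T,g},
\qquad
w_{T,g} = \frac{n_{T,g}}{\sum_{h} n_{T,h}},
\qquad
\sum_{g} w_{T,g} = 1 .
```

Even if r_{A,g} > r_{B,g} in every subgroup g, the overall rates can satisfy R_A < R_B when the weights w_{A,g} concentrate on subgroups where success is rare for both treatments, while the weights w_{B,g} concentrate on subgroups where success is common.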

Real-World Examples

🏥 Kidney Stone Study (1986)

Treatment A: 93% success on small stones, 73% on large.
Treatment B: 87% success on small stones, 69% on large.
A wins in both! But combined, B had 83% vs A's 78%. Why? A was used mostly on large (hard) stones, while B was used mostly on small (easy) ones.
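The arithmetic behind these percentages can be checked directly. The sketch below assumes the success counts commonly reported for the 1986 study (they are not listed in this text), and the names `counts`, `rate`, and `pooled_rate` are illustrative only.

```python
# (successes, total cases) per treatment and stone size.
# Assumption: these are the counts commonly cited for the 1986 study.
counts = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}


def rate(successes, total):
    """Success rate within a single subgroup."""
    return successes / total


def pooled_rate(groups):
    """Success rate after adding all subgroups together."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return successes / total


for treatment, groups in counts.items():
    total_cases = sum(n for _, n in groups.values())
    large_share = groups["large"][1] / total_cases
    print(
        f"{treatment}: small {rate(*groups['small']):.0%}, "
        f"large {rate(*groups['large']):.0%}, "
        f"combined {pooled_rate(groups):.0%}, "
        f"share of cases with large stones {large_share:.0%}"
    )
```

Under these counts, roughly three quarters of A's cases involve large stones compared with under a quarter of B's, which is why A's combined rate is pulled toward its large-stone rate.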

🎓 UC Berkeley Admissions (1973)

Overall, 44% of men were admitted vs 35% of women. Discrimination? Within most individual departments, women were admitted at similar or higher rates than men. Women applied more often to competitive departments with low admission rates.

⚾ Baseball Batting (1995-96)

David Justice beat Derek Jeter in batting average in both 1995 and 1996. But combined across both years, Jeter's average was higher! Jeter had far more at-bats in his better year (1996), while Justice had most of his at-bats in his weaker year (1995).
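The combined batting average is a weighted average of the season averages, with each season weighted by its share of at-bats. The sketch below assumes the hit and at-bat totals commonly cited for this example (they are not given in this text), and the names `seasons`, `season_average`, and `combined_average` are illustrative.

```python
# (hits, at-bats) per player and season.
# Assumption: these are the totals commonly cited for this example.
seasons = {
    "Jeter": {1995: (12, 48), 1996: (183, 582)},
    "Justice": {1995: (104, 411), 1996: (45, 140)},
}


def season_average(hits, at_bats):
    """Batting average for a single season."""
    return hits / at_bats


def combined_average(record):
    """Weight each season's average by its share of the player's at-bats."""
    total_at_bats = sum(ab for _, ab in record.values())
    return sum(
        season_average(hits, ab) * (ab / total_at_bats)
        for hits, ab in record.values()
    )


for player, record in seasons.items():
    per_season = ", ".join(
        f"{year}: {season_average(*stats):.3f}" for year, stats in record.items()
    )
    print(f"{player}: {per_season}, combined: {combined_average(record):.3f}")
```

The weighted form gives the same result as dividing total hits by total at-bats. It is written this way to make the role of the at-bat shares visible.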

💉 COVID-19 Vaccines

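Vaccine effectiveness seemed lower in highly vaccinated countries when all ages were pooled! The paradox: the elderly, who face higher risk whether vaccinated or not, got vaccinated first, so the vaccinated group was older on average than the unvaccinated group. Age was the confounding variable.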

The Lesson

Always ask: "What variables am I not seeing?"

Before combining data groups, check whether a confounding variable creates unequal group sizes or affects both the grouping and the outcome. Sometimes the grouped analysis is right; sometimes the combined analysis is right. Context is everything.
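One way to make this check routine is a small helper that flags when the per-group comparison and the pooled comparison disagree. This is a minimal sketch, not a standard library function: `shows_reversal` and its helpers are hypothetical names, and the usage data are the counts commonly cited for the 1986 kidney stone study. A flag only says that the two views conflict. Deciding which view to trust still depends on context.

```python
def success_rate(successes, total):
    """Success rate within a single group."""
    return successes / total


def pooled_rate(groups):
    """Success rate after adding all groups together."""
    successes = sum(s for s, _ in groups.values())
    total = sum(n for _, n in groups.values())
    return successes / total


def shows_reversal(a, b):
    """Return True if one option wins in every group but loses when pooled.

    `a` and `b` map the same group names to (successes, total) pairs.
    """
    a_wins_each = all(success_rate(*a[g]) > success_rate(*b[g]) for g in a)
    b_wins_each = all(success_rate(*b[g]) > success_rate(*a[g]) for g in a)
    a_pooled, b_pooled = pooled_rate(a), pooled_rate(b)
    return (a_wins_each and a_pooled < b_pooled) or (
        b_wins_each and b_pooled < a_pooled
    )


# Assumption: counts commonly cited for the 1986 kidney stone study.
treatment_a = {"small": (81, 87), "large": (192, 263)}
treatment_b = {"small": (234, 270), "large": (55, 80)}
print(shows_reversal(treatment_a, treatment_b))
```

With these counts the helper prints `True`, because A leads in both stone sizes while B leads in the pooled data.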

[Charts: Grouped Data (By Category) | Combined Data (All Together)]