One Box or Two? A War Between Decision Theories
William Newcomb, a physicist at Lawrence Livermore Laboratory, devised this problem in 1960. It was first analyzed in print by philosopher Robert Nozick (1969) and later popularized by Martin Gardner in Scientific American (1973).
Before you are two boxes. Box A is transparent and contains $1,000. Box B is opaque and contains either $1,000,000 or $0.
A Predictor—a being, superintelligence, or perfect simulation—has already examined your brain, psychology, and decision-making tendencies. It has made its prediction and filled the boxes accordingly: if it predicted you would take only Box B, it put $1,000,000 in Box B; if it predicted you would take both boxes, it left Box B empty.
The Predictor has been right in 99.9% of all previous cases. The boxes are already filled. Your choice cannot change what's in them. Do you take only Box B, or do you take both?
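Spelled out, the four possible outcomes form a small payoff matrix. Here is a minimal sketch in Python, using only the amounts from the setup above (the payoff function and labels are illustrative, not part of the original problem statement):

    # Payoff to the chooser, in dollars, for each combination of
    # the Predictor's prediction and the chooser's actual choice.
    def payoff(predicted: str, chosen: str) -> int:
        box_a = 1_000  # transparent box, always $1,000
        box_b = 1_000_000 if predicted == "one-box" else 0  # filled only if one-boxing was predicted
        return box_b if chosen == "one-box" else box_a + box_b

    for predicted in ("one-box", "two-box"):
        for chosen in ("one-box", "two-box"):
            print(f"predicted {predicted}, chose {chosen}: ${payoff(predicted, chosen):,}")

Running it prints the four cells: $1,000,000 and $1,001,000 when one-boxing was predicted, $0 and $1,000 when two-boxing was predicted.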
Evidential reasoning: One-boxers almost always get $1,000,000. Two-boxers almost always get $1,000. The evidence is overwhelming—one-boxing leads to riches.
Decision theory: This aligns with Evidential Decision Theory, which says to choose the action that is the best evidence of a good outcome.
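As a rough illustration, here is the evidential expected-value calculation in Python, assuming the 99.9% accuracy can be read as the probability of each prediction conditional on your actual choice (a simplifying assumption):

    # Expected value under Evidential Decision Theory: condition the
    # Predictor's accuracy on the choice actually made.
    ACCURACY = 0.999

    # If you one-box, the Predictor almost certainly predicted one-boxing.
    ev_one_box = ACCURACY * 1_000_000 + (1 - ACCURACY) * 0

    # If you two-box, the Predictor almost certainly predicted two-boxing.
    ev_two_box = ACCURACY * 1_000 + (1 - ACCURACY) * (1_000_000 + 1_000)

    print(f"EDT expected value, one-box: ${ev_one_box:,.0f}")  # ~$999,000
    print(f"EDT expected value, two-box: ${ev_two_box:,.0f}")  # ~$2,000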
Causal reasoning: The Predictor has already made its decision. The boxes are already filled. Your choice now cannot change the past. And whatever Box B contains, taking both boxes leaves you exactly $1,000 richer than taking Box B alone, so two-boxing dominates.
Decision theory: This aligns with Causal Decision Theory, which says to choose the action that causally produces the best outcome.
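The causal case is a dominance argument: hold Box B's contents fixed and compare the two actions within each possible state of the world. A minimal sketch, again assuming the payoffs given above (variable names are illustrative):

    # Causal Decision Theory treats Box B's contents as already fixed;
    # compare the two actions within each possible state.
    for box_b in (1_000_000, 0):  # the two possible, already-settled contents of Box B
        one_box = box_b           # take only Box B
        two_box = box_b + 1_000   # take both boxes
        print(f"Box B holds ${box_b:,}: one-box gets ${one_box:,}, "
              f"two-box gets ${two_box:,} (two-boxing is $1,000 better)")

In both states the two-boxer comes out $1,000 ahead, which is exactly why the causal reasoner takes both.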
Newcomb's Problem isn't just a puzzle—it exposes a fundamental divide in how we reason about decisions. Causal Decision Theory says you should choose actions that cause good outcomes. Evidential Decision Theory says you should choose actions that are evidence of good outcomes.
In most cases, these agree. But Newcomb's Problem pits them against each other. And the stakes extend beyond philosophy—questions about AI alignment, precommitment, and strategic interaction all touch on similar issues.
If you're building an AI that will face predictors (other AIs, game-theoretic opponents, or even its own future self), which decision theory should it use? The debate continues.
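One way to make that question concrete is to simulate policies against an imperfect predictor. The sketch below is a toy model under the problem's stated assumptions (a 99.9%-accurate predictor and the payoffs above); the policy names and the run_trials helper are hypothetical, not a proposal from the literature:

    import random

    def run_trials(policy: str, trials: int = 100_000, accuracy: float = 0.999) -> float:
        """Average payout for an agent that always follows `policy`, facing a
        predictor that correctly anticipates that policy with probability `accuracy`."""
        total = 0
        for _ in range(trials):
            other = "two-box" if policy == "one-box" else "one-box"
            predicted = policy if random.random() < accuracy else other
            box_b = 1_000_000 if predicted == "one-box" else 0
            total += box_b if policy == "one-box" else box_b + 1_000
        return total / trials

    for policy in ("one-box", "two-box"):
        print(f"{policy}: average payout is about ${run_trials(policy):,.0f}")

One-boxers average close to $999,000 and two-boxers close to $2,000, which restates the tension rather than resolving it: the simulation rewards the policy, while the causal argument evaluates the act.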