← Back to Surprising Paradoxes

The Coupon Collector's Problem

Why does completing a collection take SO much longer than expected? If there are 50 coupons, you don't need 50 draws—you need ~225! The last few items are BRUTAL.

0 / 25
Draws Made
0
Collected
0
Remaining
25

Collection Settings

More coupons = more frustration!

Expected Draws Needed

~95
n × (ln(n) + γ) ≈ n × ln(n) + 0.577n

Current Phase

Start drawing to begin collection!

Session Statistics

Collections completed: 0
Average draws needed:
Best run:
Worst run:
Efficiency vs expected:

Draws Per New Coupon

First 50%:
Next 25%:
Final 25%:
Last coupon alone:

The Mathematics of Frustrating Collections

Every Pokemon card collector, gacha gamer, or breakfast cereal enthusiast knows the pain: those last few items to complete a set take forever. The Coupon Collector's Problem explains exactly why.

The Problem

Suppose a cereal box contains one of n different coupons, each equally likely. How many boxes must you buy on average to collect all n coupons?

The Answer: E[T] = n × Hₙ ≈ n × (ln n + γ)

Where Hₙ is the nth Harmonic number and γ ≈ 0.577216 is the Euler-Mascheroni constant.

For 50 coupons: E[T] ≈ 50 × (ln 50 + 0.577) ≈ 225 draws!

Why "n × log(n)" Instead of Just "n"?

The problem is intuitive once you break it down by stages:

The total is: 1 + n/(n-1) + n/(n-2) + ... + n/1 = n × (1 + 1/2 + 1/3 + ... + 1/n) = n × Hₙ

The "Almost Complete" Torture

This is where the frustration lives. Consider collecting 50 coupons:

Phase Breakdown (n=50):
• First 25 coupons (50%): ~35 draws
• Next 12 coupons (75%): ~40 draws
• Next 8 coupons (90%): ~50 draws
• Final 5 coupons (100%): ~100 draws!

The last 10% of the collection takes nearly HALF of all draws!

Real-World Applications

Gacha Games: Mobile games with random character draws exploit this psychology. Getting the "last few" characters is designed to be expensive.

McDonald's Monopoly: The famous promotion keeps certain rare pieces extremely scarce. Most players get 20 of the 22 pieces easily, but never complete a set.

DNA Sequencing: When sequencing a genome with random fragments, this problem predicts coverage depth needed.

Hash Table Collision Analysis: In computer science, this models the birthday problem and hash function distribution.

Variance: It Gets Worse

The expected value is around n × ln(n), but there's significant variance. The standard deviation is approximately:

σ ≈ √(π²n²/6) ≈ 1.28n

For 50 coupons, the standard deviation is about 64 draws. So while the expected value is 225, you might need anywhere from 160 to 290 draws in typical cases—with occasional outliers needing 350+!

"The coupon collector problem teaches us that completing any collection follows a pattern: the beginning is easy, the middle feels like progress, and the end is pure torture."
— Mathematical folklore

The "Birthday Problem" Connection

The Coupon Collector is closely related to the Birthday Problem. While the Birthday Problem asks "how many people until a collision?", the Coupon Collector asks "how many draws until all possibilities seen?"

They're inverses: the Birthday Problem expects collisions quickly (around √n), while the Coupon Collector expects completion slowly (around n × ln n).