3Blue1Brown
June 12, 2026
TL;DR
Claude Shannon showed that how far English text can be compressed depends on how predictable it is. He devised guessing experiments to estimate the probabilities readers assign to upcoming characters, and concluded that, given enough context, English can in principle be compressed to about 1 bit per character.
“His idea was that this new string of text had fewer actual letters, but it carried the same information, in the sense that it should give just the right prompting for a duplicate of his wife to fill in the entire text.”
— Narrator describing Shannon's experiment
“His final estimate was that given at least 100 characters of context, English should, at least in principle, be compressible down to around 1 bit per character.”
— Narrator
“Today, more than 75 years later, the way we actually achieve compression close to this limit is not through merely probing at intelligence, but by making our best attempts to engineer it.”
— Narrator
1. Shannon's Problem with Short Sequences
Shannon recognized that tracking statistics of character sequences (like what follows 'TH') breaks down for longer sequences that rarely or never appear in sample texts, yet longer sequences provide crucial context for accurate prediction.
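The sparsity problem can be seen even on a tiny sample. The following is a minimal sketch, not Shannon's procedure: the sample string and the helper names `next_char_counts` and `singleton_fraction` are invented for illustration. It tabulates what follows each context of a given length and reports how many distinct contexts were seen only once.

```python
from collections import Counter, defaultdict

def next_char_counts(text, order):
    """Map each context of `order` characters to a Counter of the characters that follow it."""
    table = defaultdict(Counter)
    for i in range(len(text) - order):
        context = text[i:i + order]
        table[context][text[i + order]] += 1
    return table

def singleton_fraction(table):
    """Fraction of distinct contexts that occur exactly once in the sample."""
    once = sum(1 for counts in table.values() if sum(counts.values()) == 1)
    return once / len(table)

# Toy sample text (an assumption for illustration, not data from the source).
sample = "the thing that they thought then was that the theory held there"

for order in (2, 4, 8):
    table = next_char_counts(sample, order)
    print(f"order={order} contexts={len(table)} seen once={singleton_fraction(table):.2f}")

# What tends to follow 'th' in this sample?
print(next_char_counts(sample, 2)["th"].most_common(3))
```

As the context length grows, almost every context is seen only once, so the table says little about what is likely to come next, even though the longer context is exactly what a good guess needs.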
2. The Betty Shannon Experiment
In an informal test, Shannon had his wife, Betty, guess each letter of a passage in turn. Correct guesses were replaced by dashes, and where a guess was wrong the actual letter was shown. The correctly guessed letters were redundant: an identical guesser could fill the dashes back in, so the reduced text carried the same information, hinting at how compressible text is.
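The dash experiment can be imitated with a program in place of the human guesser. This is a rough sketch under stated assumptions: `ToyPredictor` is a hypothetical stand-in that guesses the most common character seen after the previous two characters in a small training text, and the message is assumed to contain no literal dash. The point is only that the same predictor, run again, restores the original.

```python
from collections import Counter, defaultdict

class ToyPredictor:
    """Hypothetical stand-in for the human guesser: picks the most common
    character seen after the previous `order` characters in a training text."""

    def __init__(self, training_text, order=2):
        self.order = order
        self.table = defaultdict(Counter)
        for i in range(len(training_text) - order):
            self.table[training_text[i:i + order]][training_text[i + order]] += 1

    def guess(self, history):
        counts = self.table.get(history[-self.order:])
        if not counts:
            return " "  # fallback guess for contexts never seen in training
        return counts.most_common(1)[0][0]

def reduce_text(text, predictor):
    """Replace every correctly guessed character with a dash."""
    return "".join(
        "-" if predictor.guess(text[:i]) == ch else ch
        for i, ch in enumerate(text)
    )

def restore_text(reduced, predictor):
    """Rebuild the original by letting the same predictor fill in each dash."""
    restored = ""
    for ch in reduced:
        restored += predictor.guess(restored) if ch == "-" else ch
    return restored

# Both strings are invented for illustration.
training = "the quick brown fox jumps over the lazy dog and the other dog then rests"
message = "the dog then jumps over the other fox"

predictor = ToyPredictor(training)
reduced = reduce_text(message, predictor)
print(reduced)
print(restore_text(reduced, predictor) == message)
```

Restoration works because the predictor is deterministic and sees exactly the same history at each position, which mirrors the idea of handing the reduced text to a duplicate of the original guesser.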
3. Formal Probability Measurement
Shannon's 1950 paper outlined a formal experiment in which multiple human subjects predicted letters. The number of guesses each letter required was recorded and used to estimate the probability distributions people assign to upcoming characters.
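One way to turn guess counts into an entropy estimate is through the upper and lower bounds Shannon derived from the guess-number frequencies. The sketch below assumes those standard forms: with q_i the fraction of letters found on guess i, the upper bound is the entropy of the q_i, and the lower bound is the sum of i (q_i - q_{i+1}) log2(i), which presumes the q_i do not increase with i. The counts are made up for illustration and are not Shannon's data.

```python
import math

def guess_entropy_bounds(guess_counts):
    """Return (lower, upper) bounds in bits per character.

    guess_counts[i] is how often the correct letter was found on guess i + 1.
    The lower bound assumes the frequencies are non-increasing in the guess number.
    """
    total = sum(guess_counts)
    q = [c / total for c in guess_counts]
    # Upper bound: entropy of the guess-number distribution.
    upper = -sum(p * math.log2(p) for p in q if p > 0)
    # Lower bound: sum over i of i * (q_i - q_{i+1}) * log2(i), with q past the end taken as 0.
    q_padded = q + [0.0]
    lower = sum(
        (i + 1) * (q_padded[i] - q_padded[i + 1]) * math.log2(i + 1)
        for i in range(len(q))
    )
    return lower, upper

# Hypothetical tallies: 80 letters found on the first guess, 7 on the second, and so on.
hypothetical_counts = [80, 7, 4, 3, 2, 2, 1, 1]
lower, upper = guess_entropy_bounds(hypothetical_counts)
print(f"between {lower:.2f} and {upper:.2f} bits per character")
```

Two sanity checks follow from the formulas: a guesser who is always right on the first try gives bounds of 0 bits, and a guesser whose correct answer is equally likely at any of 27 positions gives log2(27) bits for both bounds.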
4. The 1 Bit Per Character Estimate
By combining statistics from short sequences with human guessing behavior, Shannon concluded that English with 100+ characters of context should theoretically compress to around 1 bit per character.
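To put the 1 bit figure in perspective, it helps to compare it with a code that ignores predictability entirely. The short calculation below assumes a 27-symbol alphabet (26 letters plus space), so each character costs log2(27) bits when all symbols are treated as equally likely; the helper `compressed_size_bytes` is a name introduced here for illustration.

```python
import math

ALPHABET_SIZE = 27                       # assumption: 26 letters plus space
uniform_bits = math.log2(ALPHABET_SIZE)  # cost per character with no prediction at all
estimate_bits = 1.0                      # the estimate quoted above

def compressed_size_bytes(num_chars, bits_per_char):
    """Bytes needed for num_chars characters at a given average cost per character."""
    return math.ceil(num_chars * bits_per_char / 8)

redundancy = 1 - estimate_bits / uniform_bits

print(f"uniform code: {uniform_bits:.2f} bits per character")
print(f"implied redundancy: {redundancy:.0%}")
print(f"1000 characters at 1 bit each: {compressed_size_bytes(1000, estimate_bits)} bytes")
print(f"1000 characters at 8 bits each: {compressed_size_bytes(1000, 8)} bytes")
```

Under these assumptions, roughly four fifths of each character is predictable from what came before, which is the redundancy a compressor can try to remove.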
5. From Human Intelligence to Engineered Compression
Modern compression techniques achieve compression close to Shannon's estimate not by probing human prediction but by engineering artificial intelligence to do the predicting, reflecting the broader principle that compression is intelligence.
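The link between prediction and compression can be made concrete: if a model assigns probability p to the character that actually occurs, an ideal coder spends about -log2(p) bits on it, so better predictions mean fewer bits. The sketch below is a toy under stated assumptions, not how any particular modern compressor works. It uses a simple adaptive count-based model with add-one smoothing in place of a learned model, and `ideal_code_length_bits` is a name introduced here.

```python
import math
from collections import Counter, defaultdict

def ideal_code_length_bits(text, order, alphabet):
    """Total of -log2 p(char | previous `order` chars) under an adaptive count model.

    Counts are updated only after each character is scored, so a decoder
    could rebuild the same probabilities step by step.
    `alphabet` must contain every character that appears in `text`.
    """
    counts = defaultdict(Counter)
    total_bits = 0.0
    for i, ch in enumerate(text):
        context = text[max(0, i - order):i]
        seen = counts[context]
        # Add-one smoothing keeps every character's probability above zero.
        p = (seen[ch] + 1) / (sum(seen.values()) + len(alphabet))
        total_bits += -math.log2(p)
        seen[ch] += 1
    return total_bits

text = "ab" * 50  # a highly predictable toy string
for order in (0, 1):
    bits = ideal_code_length_bits(text, order, "ab")
    print(f"order={order}: {bits / len(text):.2f} bits per character")
```

With no context the model can only learn that the two letters are equally common, so it spends about one bit per character. With a single character of context it learns the alternation and the cost falls well below that. Swapping in a stronger predictor lowers the total in the same way.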