As promised... Because I am professionally bored and frustrated and the 10,000 Monkey thing annoys me deeply (and is in fact related to the issue of life appearing "by chance" from molecules just randomly bouncing around some peripheral blue planet in a forgotten corner of outer space), I took the time to work out the related probability theory. (
I swear I actually taught a long class in the middle of working this out, so I'm not a total parasite!!) The combinatorics are sufficiently difficult that you cannot come up with exact probabilities, but you
can come up with upper and lower bounds.
I assume that we are working with an alphabet of size
A (e.g.
A=26) and time is measured in units of typewriter keypresses,
T. If the monkeys are slow,
T is measured in units of 1 second, but if they are employable as word processors then maybe
T is measured in units of 1/10 second. I am interested in the probability of producing, at some point in time, a text of size
L. Shakespeare's
Romeo and Juliet is about
L = 105,891 to 147,189 letters, depending on how exactly you measure. I assume that every character could be pressed with equal probability, and that sequential keypresses are independent. Neither of these assumptions is strictly true from purely physical considerations because of the arrangement of a standard QWERTY keyboard, but these assumptions actually favor higher probabilities, because it would be harder to break the autocorrelation to produce a specific work that was inconsistent with the arrangement of QUERY letters, and some of the necessary keys may also be slightly harder to access (e.g. "x" in "
exeunt").
The lower bound is acheived by chopping up the entire timeline into
T/
L intervals, each of length
L. The probability of typing out the desired text in any one of those little intervals is 1/
AL, so the probability of not typing out that text (in that one inverval) is 1-1/
AL. The probability of the monkey missing his chance in every one of the
T/
L intervals is (1-1/
AL)
(T/L). This quantity overestimates the overall probability of failure because it ignores the opportunity to type the first portion of the text in one interval and the remainder of the text in the next interval. Thus, it is an upper bound on the failure probability, and therefore 1 - (1-1/
AL)
(T/L) is a lower bound on the success probability.
L'Hopital's rule (break out your old calculus textbook) demonstrates that this probability converges to 1 as
T increases to infinity (and
A and
L remain fixed), which is the original inspiration for the 10,000 monkey meme: indeed, if we wait long enough, even one monkey will produce Shakespeare,
with certainty.
But how long would we have to wait in the real world? First, let's calculate the upper probability bound. This is achieved by observing that the monkey could start the exact text on the first
L keypresses, and then could type whatever he wants on the remaining
T-
L keypresses, for a total of
A(T-L) remaining random possibilities! Or he could start on the second keypress, with still the same number of random possibilities [since there are still
T-
L free key presses, the first, then the (
L+2)
nd, (
L+3)
rd, etc.] Or he could start with the third keypress, etc. There are
T-
L+1 possible starting points, but if he doesn't get it right before keypress number
T-
L+2, there's no way he can finish the text in sufficient time. Thus the probability is at most (
T-
L+1)
A(T-L)/
AT. Why "at most"? Well because we're double counting some of the same exact sequences. Adding up the
T-
L+1 possible starting points is the same thing as taking the union of
T-
L+1 different sets, and by adding probabilities we are ignoring the overlaps in the sets. The probability of overlap depends on the exact nature of the sequence of the desired text, which is why the combinatorics are so difficult. No matter, the upper bound is sufficient for our purposes, because we can see what the
maximum probability is for a fixed
A,
L, and
T:
P <= (
T-
L+1)
A(T-L)/
AT = (
T-
L+1)
A-Llog
10(
P) <= log
10(
T-
L+1)-
L*log
10(
A),
where
P is the probability of one monkey producing the desired text in time
T.
I
downloaded Romeo and Juliet and found 147,189 characters, 64 of them unique (upper and lower case letters, plus punctuation). Let's consider going easy on the the monkey and give him an immortal supply of bananas if he can get the play right without punctuation and ignoring upper and lower case distinctions. This reduces the problem to a text of size 105,891 on 26 letters. The universe is about
13.7 x 109 years old, i.e. 4.3 x 10
17 seconds, or 4.3 x 10
18 typewriter keypresses (assuming a monkey who is competitive in the word processing labor market). For the "hard" problem (perfect punctuation and case), log
10(
P) <= -265,831 in one cosmic lifetime, i.e.
P <= 10
-265831. For the "easy" problem (no punctuation or case distinction), log
10(
P) <= -149,814. At this small scale, an ensemble of 10,000 monkeys merely multiplies the single-monkey probability by 10
4, so log
10(
P10000) <= -265827 for the "hard" problem and log
10(
P10000) <= -149,810 for the "easy" problem.
Let's turn the problem around and ask how long we would need to wait. With some algebra,
T >=
P10000 AL +
L - 1 >=
P10000 AL,
so that log
10(
T) >= log
10(
P10000) +
L*log
10(
A)
105,891 * log
10(26) = 149,833, so to get even a 1 in 100 chance, we would have to wait 10
149831 keypresses, or 10
149824 years, or over 10
149813 cosmic lifetimes to get 10,000 monkeys to type out a simplified version of
Romeo and Juliet.
Thems not very good odds. Let's not torture 10,000 monkeys in the dim hopes of proving a profoundly stupid point.