I received a wonderful email a few weeks ago about the Shakespeare Million Monkeys problem.  This problem is actually a statistical challenge embraced by the computing community, and tackled with advanced computing techniques like parallelism.  The problem is described well in these Wikipedia articles:

Infinite Monkey Theorem (Wikipedia)
Weasel Program, a similar concept (Wikipedia)

There is also some additional information on the results of various attempts to automate the solution. They are quite fascinating and worth reading for any software developer.

A few million monkeys randomly re-create the works of Shakespeare (Jesse Anderson, uses Hadoop)

Solving The Shakespeare Million Monkeys Problem In Realtime With Parallelism And SignalR

There is even a Google open-source project for this: http://code.google.com/p/million-monkeys-project/

Basically, the theorem says that if an infinite number of monkeys type randomly on keyboards for an infinite amount of time, they will eventually write Shakespeare.  The odds of any single attempt succeeding are astronomically small, but not zero.  The Weasel program is a related, simpler exercise with the goal of generating only one phrase from a Shakespeare play (Hamlet): “Methinks it is like a weasel”
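To put a number on "astronomically small", here is a minimal sketch for the weasel phrase alone. It assumes a 27-key typewriter (26 capital letters plus the space bar) and that every keystroke is uniformly random and independent; the function names are mine, for illustration only.

```python
import string

# Assumption: the typewriter has only 26 capital letters and a space bar.
ALPHABET = string.ascii_uppercase + " "
TARGET = "METHINKS IT IS LIKE A WEASEL"


def chance_of_exact_match(target: str, alphabet: str) -> float:
    """Probability that one run of uniformly random keystrokes equals target."""
    return (1 / len(alphabet)) ** len(target)


def expected_attempts(target: str, alphabet: str) -> int:
    """Average number of independent attempts needed for one exact match."""
    return len(alphabet) ** len(target)


if __name__ == "__main__":
    print(len(TARGET), "characters")
    print(expected_attempts(TARGET, ALPHABET), "attempts on average")
```

With 28 characters and 27 keys, the average works out to 27^28 attempts, on the order of 10^40, and that is for one short phrase rather than a complete work.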

My focus here is not on the technical nature of the problem, but on what is revealed about human thinking by the decisions made in the various software applications written to test the theory. Why?  Because somehow the theme of a pure statistical probability is altered as the application developers impose what they interpret as the basic rules of evolution to guide the monkeys, at least in code.

All of the examples I have seen for these applications start with creating one or more monkeys who type keys randomly.  This follows the theorem, as long as we don’t ask who made the typewriter and left it there.  (Yes, yes, it’s the software guys.  I know.)

So these monkeys, who have been randomly typing away on their keyboards, reproduce other monkeys who also type, and the cycle repeats endlessly to create massive numbers of monkeys typing random keys.  This also follows the guidelines of the theorem and of evolution.  Evolution implies that the reproduced monkeys (the next generation) inherit traits from their parents, and that some mutations can occur.  In software, it can be as simple as giving each random number generator (i.e. each monkey) a different seed, however slightly altered from the original.  If you write an algorithm to randomly mate two monkeys, a hybrid seed could be calculated from the mating, with the occasional unexpected seed value to represent a mutation.
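The seed idea can be sketched in a few lines. This is only an illustration under my own assumptions: the `Monkey` class, the `mate` function, the averaging rule for the hybrid seed, and the 5% mutation rate are all hypothetical and not taken from any of the applications linked above.

```python
import random
from dataclasses import dataclass

ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ "  # assumed 27-key typewriter


@dataclass
class Monkey:
    seed: int  # the monkey's only "gene" in this sketch

    def type_keys(self, length: int) -> str:
        """Type `length` random keys, fully determined by the monkey's seed."""
        rng = random.Random(self.seed)
        return "".join(rng.choice(ALPHABET) for _ in range(length))


def mate(a: Monkey, b: Monkey, fate: random.Random,
         mutation_rate: float = 0.05) -> Monkey:
    """Create a child whose seed is a hybrid of its parents' seeds.

    Occasionally (mutation_rate) the child gets an unexpected seed instead.
    Averaging the two seeds is just one possible "hybrid" rule.
    """
    if fate.random() < mutation_rate:
        return Monkey(seed=fate.getrandbits(32))  # mutation
    return Monkey(seed=(a.seed + b.seed) // 2)    # inherited hybrid


if __name__ == "__main__":
    fate = random.Random(2024)
    child = mate(Monkey(seed=10), Monkey(seed=20), fate)
    print(child, child.type_keys(28))
```

One caveat on this design: with a typical pseudo-random generator, a seed close to the parents' seeds does not make the child's typing resemble theirs. The inheritance here is symbolic, which is worth keeping in mind for the rest of the discussion.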

So at this point, the monkeys are staying within the boundaries of evolution.  If you grant them an unlimited lifespan, they can continue to type forever alongside the multitude of generations of their offspring, allowing for a great variety of comparison among the generations to see if certain mutations or genetic lines are increasing the quality of work toward the desired Shakespearian result.

And that would be a great test of the theorem.  But whether it is motivated by a lack of resources (or patience) to let the virtual monkeys continue for a potentially infinite amount of time, or by a desire to test how the result can be driven, some unnatural algorithm is applied to enforce a form of selection or filtering.  The algorithm is designed either to prevent the reproduction of those monkeys not producing desired patterns, or to blatantly kill them off.  The end goal of this reduction is to improve the output of the remaining monkeys, so that successive generations have a better chance of producing outputs closer and closer to the desired result.

And how is the decision made to determine which monkeys are allowed to reproduce (or even live)?  If the spirit of the statistical theorem were being followed, we would have to choose numerous random ways of determining the monkeys to bless or curse with a virtual fate.  We could write an algorithm to simulate disease and accidents, with both leading to injuries and fatalities among the monkeys–even accounting for mutations/adaptations resulting from the accidents among the survivors.  These mutations themselves could also be represented as beneficial or detrimental.

To keep the code from becoming overly complex, we would disregard things like geographic relocation and separation of groups of monkeys, sporadic reunions of individual monkeys within groups after time has passed (intercultural interaction), and other things that living creatures experience that mold their thinking.  The random occurrences of life or death due to random circumstance would be the sole cardinal rule to prevent prejudicial influence on the reproduction of the monkeys.  A storm could kill a monkey with no talent, or the next monkey Einstein (or, more topical to the discussion, the next monkey Ralph Waldo Emerson).  No one could know who would be affected.
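A filter that respects this cardinal rule never looks at what a monkey has typed. A minimal sketch, assuming a single flat `death_rate` stands in for storms, disease and accidents (the function name and the rate are hypothetical):

```python
import random
from typing import List, TypeVar

T = TypeVar("T")


def apply_random_fate(population: List[T], fate: random.Random,
                      death_rate: float = 0.1) -> List[T]:
    """Return the survivors of one round of random misfortune.

    Every monkey faces the same odds; its output is never inspected,
    so talent (or lack of it) has no bearing on who survives.
    """
    return [monkey for monkey in population if fate.random() >= death_rate]


if __name__ == "__main__":
    fate = random.Random(42)
    troop = ["monkey-%d" % i for i in range(10)]
    print(apply_random_fate(troop, fate, death_rate=0.3))
```

Note that the function has no parameter for the target text at all, which is the whole point: Shakespeare cannot influence a filter that never sees him.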

But the filter chosen for the selection process is anything but random.  It does follow one rule of evolution, in that there is a penalty/reward altering the behavior of the next generation of monkeys.  The penalty is the monkey dies, or isn’t allowed to reproduce.  The reward is the monkey creates offspring, which inherits its traits.  But the ultimate determining factor used for determining this penalty or reward scenario is quite amusing: How close is your output to Shakespeare?
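In code, that question usually becomes a scoring function. The sketch below shows one common form, counting the characters that already match the target in the right position; the names `closeness` and `select_breeders` are mine, and real implementations may score differently.

```python
from typing import List


def closeness(output: str, target: str) -> int:
    """Count positions where the typed character matches the target."""
    return sum(1 for typed, wanted in zip(output, target) if typed == wanted)


def select_breeders(outputs: List[str], target: str, keep: int) -> List[str]:
    """Keep only the `keep` outputs closest to the target.

    This is the non-random filter: the target text itself decides
    which monkeys get to reproduce.
    """
    ranked = sorted(outputs, key=lambda text: closeness(text, target),
                    reverse=True)
    return ranked[:keep]


if __name__ == "__main__":
    attempts = ["AAAA", "MEAA", "METH"]
    print(select_breeders(attempts, "METH", keep=2))
```

Unlike a random filter, every call here takes the target as an argument. The desired answer is an input to the selection step, not just a yardstick applied afterward.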

If the pure random output were simply tested against the Shakespearian phrase or work, the theorem would be tested.  But if a decision is made on how (or whether) a new monkey is created, demoted or promoted based on that test result, the end result is being prejudiced. Ironically, that type of filtering actually creates an answer, instead of letting an answer evolve naturally.

To better understand this statement, imagine if the monkeys type randomly through the generations and, despite random storms, diseases, wars, personal tragedies, criminal acts, etc., one monkey in the vast array of generations reproduces a work of Shakespeare. That is truly astonishing.  Shakespeare was only a measuring stick at interim points in the generations and at the end of the test: it had no influence on the monkeys. A monkey producing a Shakespearian output in this scenario did so by pure chance, and the result is de facto (i.e. concerning fact).  It evolved despite all the circumstances thrown against it.  De facto is the expected result of the theorem.

When the works of Shakespeare are used to test the output, for the purpose of altering future generations in some way to direct them to better Shakespeare-like results, the method of selection becomes de jure (i.e. concerning law).   It is in this scenario that Shakespeare is a million monkeys’ uncle: his works are reflected in his virtual offspring, because he is enforcing his standard at every stage of reproductive events.

One thing that is important to keep in mind: in both scenarios, the monkeys are not aware of why they are being killed off or not reproducing. They just keep moving on, typing regardless of the circumstance.  The question is whether the great Shakespeare God  (in the virtual sense) is intervening in their destiny, or if that Shakespeare output is truly the logical conclusion of random circumstance.

I guess that is up to the monkeys to decide for themselves…  and that would be the really interesting algorithm.

Follow-up: I received feedback from several people that the exercise of the theorem is to see if an application can “teach” the monkeys how to progressively get better at writing Shakespeare, until they are writing actual Shakespeare.  And a baseline of a Shakespearean final product is needed to do so. I am aware of that.

My commentary here is to point out the “thinking”, so maybe I should point out mine.  It is a dangerous thing to mix the terms evolution, monkeys and math together.  It risks letting readers’ preconceived ideas about evolution and monkeys color their understanding of the theorem.  Personally, I question the motive for choosing the word monkey as the symbol and the word evolution as the process, which implies that the math-based theorem is somehow trying to prove evolution.  Is that a wrong perception?  Of course, but it takes reading the whole article in detail (which requires a technical background to understand the process) to realize that it is not a true evolutionary methodology.

This is where sticking with weasels, or even using dogs or goldfish, works better for the theorem.  Each of these is just being taught tricks to produce a desired outcome, and no reference to (de facto) evolution is implied.