The AND-Multiplication Error

Thomas D. Schneider, Ph.D.

The multiplication rule. It is well known from elementary probability theory that if two events are independent, then we may multiply the probabilities of each event to determine the probability that both events occur. Suppose that there are two events A and B, with the probability of occurrence for A being P_A and the probability of occurrence for B being P_B. Further assume that the events neither influence each other nor share a common source of influence. Then the probability that both events occur is P_A x P_B. That is, when the events A and B are independent, the probability that event A AND event B both occur is found by multiplying the probabilities of the individual events.

Example: A single die is a cube having 6 faces, numbered by dots. The probability of getting a single dot from an unbiased die is 1 out of 6 or 1/6. The probability of getting two dice each to have one dot (snake eyes) is 1/6 multiplied by 1/6 or 1/36 ≈ 0.028. Consider now a case where the two dice are glued together so that on one side there is a snake eyes. We toss the pair and read it only if it does not land on end, as a stack with one die on top of the other. Since the glued pair can then rest on one of 4 sides, the probability of 'snake eyes' is 1/4 = 0.25. The non-independence increased the probability of the event ninefold.
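The arithmetic in the dice example can be checked with exact fractions. This is only a minimal sketch; the function names are made up for illustration, and the glued case simply assumes that each of the 4 readable sides is equally likely.

```python
from fractions import Fraction


def p_independent_snake_eyes() -> Fraction:
    """Two separate fair dice: the multiplication rule applies."""
    p_one_dot = Fraction(1, 6)
    return p_one_dot * p_one_dot


def p_glued_snake_eyes(readable_sides: int = 4) -> Fraction:
    """Glued dice: exactly one of the readable sides shows snake eyes.

    Assumes every readable side is equally likely (an illustration only).
    """
    return Fraction(1, readable_sides)


independent = p_independent_snake_eyes()
glued = p_glued_snake_eyes()

print(independent)          # 1/36
print(glued)                # 1/4
print(glued / independent)  # 9
```

Using Fraction rather than floating point keeps the ratio exact, so the factor of 9 comes out as an integer.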

The multiplication rule does not apply to biological evolution. A common error in the non-scientific literature and poorly written papers is to assume that probabilities multiply for computing components of living things such as proteins. A typical argument notes that proteins are about 300 amino acids long and that there are 20 different kinds of amino acids. If such a string were to be generated using independent selection of the amino acids, then the probability of generating any particular string is 20^-300, a very small number indeed. While this may be true for random strings, it does not directly apply to proteins found in living organisms. Why? Because individual mutations accumulate one-at-a-time and there is amplification (replication) between steps. That is, if one starts with a given amino acid string, the mutations in the genome (from which the string is derived) are sequential. A mutation occurs, perhaps changing the amino acid string. If the change is bad, which is true for the majority of changes, the organism dies and its genes are gone. (In diploids, recessive defects will be removed more slowly since they are only exposed when an organism becomes homozygous for the mutation.) If a rare lucky change occurs that has some advantage (or is at least neutral or only slightly deleterious) then the organism may survive to produce offspring. The possibility of appearance and acceptance (by natural selection processes) of mutations in the offspring therefore depends strongly on whether the previous generation survived and on the number of progeny. Genetic algorithm experiments, such as the Ev evolution program, demonstrate clearly that the probability of generating what would be an extremely rare genetic string if the steps were independent can be high. So the evolution of a 300 amino acid protein is reasonably easy to attain.
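To get a feel for the size of 20^-300, the naive independent-selection figure can be expressed as a power of ten. The sketch below only reproduces that naive calculation (the one the text argues does not apply to evolved proteins); the function name is hypothetical, and logarithms are used because the value is too small for ordinary floating point.

```python
import math


def log10_random_string_probability(length: int = 300, alphabet: int = 20) -> float:
    """Base-10 log of alphabet**(-length), the independent-selection figure."""
    return -length * math.log10(alphabet)


exponent = log10_random_string_probability()
print(round(exponent, 1))  # -390.3, so 20^-300 is roughly 10^-390
```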

A concrete example. Suppose we have 10 coins that land as 'heads' or 'tails' after they are all flipped at once in parallel. The probability of getting all heads is (1/2)¹⁰ = 1/1024. The probability of not getting all heads in a single parallel flip is 1-1/1024 so the probability of not getting all heads after F parallel flips is (1-1/1024)^F. After a number of flips F, the probability of finally getting all heads is

      1 - (1-(1/1024))^F.

For example, after 1024 tries the chance of getting all heads at least once is only 1 - (1023/1024)¹⁰²⁴ ≈ 63.2%. So it could take quite a while to get all heads!
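The formula above is easy to evaluate directly. A minimal sketch, with a hypothetical function name and the 10 fair coins of the example as the default:

```python
def p_all_heads_at_least_once(flips: int, coins: int = 10) -> float:
    """Probability of seeing all heads at least once in `flips` parallel flips."""
    p_single = 0.5 ** coins              # (1/2)^10 = 1/1024 for 10 coins
    return 1.0 - (1.0 - p_single) ** flips


p_1024 = p_all_heads_at_least_once(1024)
print(f"{p_1024:.3f}")  # 0.632
```

Calling the function with larger values of `flips` shows how slowly the probability approaches 1 when every try starts from scratch.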

But that is not what happens in nature. To model what happens in natural biological systems, consider flipping all 10 coins at once. Initially there will be about 5 heads and 5 tails. We paste these to an index card. We then make 100 copies of the card, including the states of the 10 coins. While we make the copies of the coin states, we sometimes make an error, changing a head to a tail or a tail to a head. We then find the card that has the most coins with heads up and we throw away all the other cards. So if even one card has an extra head, it will be found. We reproduce that card 100 times (with errors) and repeat the selection. Suppose that we make an error in copying a coin state about 1 time in 100. Then almost every other generation we will get another head. Starting from about 50% heads, it will take only about 10 generations to get a card with all heads. That is what happens in nature. Notice that we have wasted a lot of cards, coins and glue to get the all-head card - about 1000 sets (10 generations of 100 cards each)! - but the result comes quickly. (If the coin is a penny, the 10,000 coins cost $100.) Consider the dandelion. It creates many progeny. Many fall on the wrong ground or are perhaps eaten. But the ones that get through can repopulate your entire yard!
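The index-card procedure can be tried out as a small simulation. This is a rough sketch under stated assumptions, not the Ev program: the function name and parameters are made up for illustration, the parent card is discarded after copying, and ties between equally good cards are broken by taking the first one found.

```python
import random


def evolve_cards(n_coins=10, n_copies=100, error_rate=0.01,
                 seed=0, max_generations=1000):
    """Copy-with-errors plus selection, as in the index-card example.

    Returns (generations, cards_used). True means heads.
    """
    rng = random.Random(seed)
    # Initial parallel flip of all coins, pasted onto the first card.
    card = [rng.random() < 0.5 for _ in range(n_coins)]
    generations = 0
    while not all(card) and generations < max_generations:
        # Make the copies; each coin state is miscopied with probability error_rate.
        copies = [
            [(not coin) if rng.random() < error_rate else coin for coin in card]
            for _ in range(n_copies)
        ]
        # Keep the card with the most heads and throw the others away.
        card = max(copies, key=sum)
        generations += 1
    return generations, generations * n_copies


for seed in range(5):
    generations, cards_used = evolve_cards(seed=seed)
    print(f"seed {seed}: {generations} generations, {cards_used} cards")
```

Running it with several seeds gives a feel for how the generation count compares with the estimate in the text, and for how many cards are discarded along the way.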

Summary. It is inappropriate to multiply probabilities unless the two events are independent. One must account for all of the events (in other words, honor the dead). The functional amino acids in a protein are not obtained independently since many organisms die for the few that survive to reproduce. Each change to an amino acid occurs in the context of the current protein and therefore depends on the previous history of the protein. Although the amino acids may be functionally independent (allowing, for example, the computation of a sequence logo), the appearance of the selected amino acids is sequential during evolution and is, therefore, dependent on previous steps. It is invalid to directly apply the multiplication rule to computing the probability that proteins came into existence.

Links

See the discussion in the Ev paper for a computer model of evolution and precise examples.
google: probability event multiply
snake eyes: mathcentral.uregina.ca and paos.colorado.edu/research/wavelets/montecarlo.html.
Ask Dr. Math: FAQ: Introduction to Probability

Documents that make the AND-multiplication error and use it to draw conclusions are flawed to the core and their conclusions can be immediately dismissed as invalid.

Below are examples of documents that make the AND-multiplication error, listed alphabetically by author.

Michael J. Behe
- Michael J. Behe, "Self-Organization and Irreducible Complexity: A Reply to Shanks and Joplin," Philosophy of Science 67 (2000): 155-162. On pages 157-158:
  . . . no law of physics automatically rules out the chance origin of even the most intricate IC [irreducibly complex] system. As complexity increases, however, the odds become so abysmally low that we reject chance as an explanation.
  This quote comes from Behe, Biochemistry, and the Invisible Hand by Niall Shanks and Karl Joplin.
William Dembski
- William Dembski. No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence. Rowman and Littlefield Publishers, Lanham Maryland, 2002. On page 293:
  Randomly picking 250 proteins and having them all fall among those 500 therefore has probability (500/4,289)²⁵⁰, which has order of magnitude 10^-234 and falls considerably below the universal probability bound of 10^-150.
  This quote comes from 7.5 Darwinism Fails the Behe Test by Kurt Johmann (who by agreeing with this quote also has made the AND-multiplication error). The basic premise of the entire book is built upon this error, so it appears repeatedly.
Stephen C. Meyer
- The Origin of Biological Information and the Higher Taxonomic Categories Proceedings of the Biological Society of Washington, 117[2]:213-239, August 4, 2004. Republished online August 28, 2004.
See a detailed discussion of this case.

L. M. Spetner

Spetner, L. M. (1964) Natural selection: an information-transmission mechanism for evolution. J. Theor. Biol., 7, 412-429.
Spetner, L. M. (1998) NOT BY CHANCE! Shattering the Modern Theory of Evolution, Judaica Press, New York. As with Dembski, Spetner's basic premise of the entire book is built upon this error, so it appears repeatedly. For example, Page 130:
For just a moment let's look at the chance of a species evolving into a new one if at each step there is only one potential copying error that can be adaptive. What we've found above is the chance of just one of the small steps occurring. To get a new species, 500 of them have to occur without any failures. As we shall soon see, for successful evolution the probability of each has to be very nearly one. The chance of 500 of these steps succeeding is 1/300,000 multiplied by itself 500 times. The odds against that happening are about 3.6×10^2,738 to one, or the chance of it happening is about 2.7×10^-2,739. That's a very small chance! It's more than 2,000 orders of magnitude smaller than the chance of the event I call impossible.
The only reason that Spetner found this impossible is that he made an inappropriate computation, one that is not relevant to the situation at hand. The same quote appears at Chance - Probability Alone Should End the Debate, www.WindowView.org. So it does indeed end the debate: Spetner and www.WindowView.org have made a fatal error.

See the Discussion in the Ev paper.

Lee Spetner responds (briefly) to Tom Schneider by William Dembski on November 10th, 2006.

Challenge (on 2006 Nov 15): I invite Dr. Spetner to compute the probability that 16 DNA binding sites of 4 bits each can evolve.

Solution (presumably): That's 16×4 = 64 bits, so the probability of it forming randomly should be 2^-64 = 1/18446744073709551616 ≈ 5.4 x 10^-20 according to Dr. Spetner's method.
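The value of 2^64 is best checked with exact integer arithmetic, since floating point cannot represent all 20 digits. A minimal sketch, with variable names chosen only for illustration:

```python
from fractions import Fraction

sites = 16          # DNA binding sites in the challenge
bits_per_site = 4   # information per site, in bits

total_bits = sites * bits_per_site
denominator = 2 ** total_bits        # exact, Python integers do not overflow
p_random = Fraction(1, denominator)  # the figure given by the multiplication method

print(total_bits)                # 64
print(denominator)               # 18446744073709551616
print(f"{float(p_random):.1e}")  # 5.4e-20
```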

Question: How many generations would it take to have this appear? (One mutation per generation in a genome of 256 positions for the recognizer.)

Dr. Spetner's response:

2008 Jul 31: I reviewed Lee Spetner's objection to this web page. He claims that the points made on this page are incorrect. The specific computation Spetner uses takes the adaptive mutation to have probability p = 1/300,000, which is perhaps high for the mutation rate in bacteria (Wikipedia gives 10^-8). His claim is that the probability of two of these occurring is p^2 and so on. What he neglects is that the first can get fixed in the population before the second mutation occurs. So the probability at each step is only p and the p values do not multiply. Because of the selective advantage of the mutation, the mutation takes over the population. So the next probability is again p. Note that in one ml of saturated bacterial culture (grown overnight) there are typically 10⁸ bacteria, so there would be on average p*10⁸ = 333 adaptive mutations per day. Even with p = 10^-8, this is plenty to do lots of nice bacterial genetics.
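The culture arithmetic in this note can be reproduced in a few lines. This sketch assumes, as the note does, one chance per bacterium in a population of 10⁸; the function name is hypothetical.

```python
from fractions import Fraction


def expected_adaptive_mutations(p: Fraction, population: int) -> Fraction:
    """Expected number of adaptive mutations: one chance per cell."""
    return p * population


population = 10 ** 8  # bacteria in one ml of saturated culture, per the note

with_spetner_p = expected_adaptive_mutations(Fraction(1, 300_000), population)
with_low_p = expected_adaptive_mutations(Fraction(1, 10 ** 8), population)

print(float(with_spetner_p))  # about 333.3
print(float(with_low_p))      # 1.0
```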

Royal Truman
- Protein families: chance or design: Royal Truman and Michael Heisig, TJ 16(3) 2001. (tinyurl)
- The Problem of Information for the Theory of Evolution: Has Tom Schneider Really Solved It? Royal Truman, 2001.

2006 Jul 19: Mark Hancock pointed out this statement, which used to be in the text above, is not right: "It would take about 1024 tries to get all heads." He said: "After 1024 tries, you'd actually only have a 1 - (1023/1024)^1024 ~= 63.2% chance of getting all heads (at least once) in the case that you flip 10 coins every time." Thanks for the correction!


Schneider Lab
origin: 2004 Sep 17
updated: 2011 Aug 16