Rebuttal to William A. Dembski's Posting and to His Book "No Free Lunch"

Thomas D. Schneider

Rebuttal to William A. Dembski's Posting

2001 June 6.

William A. Dembski claims (metanexus june 5, 2001) (another link: http://www.arn.org/docs/dembski/wd_americasobsession.htm) that the ev program does not demonstrate an information increase. On this page I will correct several errors in his posting. (Note: a critical test of one of his claims is given on another page.)

As an example of smuggling in complex specified information that is purported to be generated for free, consider the work of Thomas Schneider.
The ev paper did not make this claim since the phrase "complex specified information" was not used. It is unclear what this means. Shannon used the term "information" in a precise mathematical sense and that is what I use. I will assume that the extra words "complex specified" are jargon that can be dispensed with. Indeed, William A. Dembski assumes that information is specified complexity, so the term is redundant and can be removed.
Schneider heads a laboratory of experimental and computational biology at the National Cancer Institute. He is well-versed in Shannon's theory of information, regularly applies it in his research, and devotes considerable space to it on his website. (2)
The statement is slightly unclear. I am not the head of Molecular Information Theory.
Starting with an evolutionary algorithm acting on a randomly chosen sequence from the phase space, Schneider then purported to generate an information-rich sequence corresponding to a finely tuned genetic control system in which one part of the genome codes for proteins that precisely bind to another part of the genome.
It is more precise to say that the information was gained by the set of binding sites. Note the analysis in the paper that shows that the total genomic uncertainty does not change much during the evolution.
Schneider thinks that he has generated complex specified information for free, or as he puts it, "from scratch."
This statement represents a fundamental misunderstanding of the paper. The phrase 'for free' does not appear in the paper. The claim in the ev paper is that the information appears under replication, mutation and selection, commonly known as 'evolution'. It is not for free! Half of the population DIES every generation! In the standard example given in the paper, to gain 4 bits required the (virtual) deaths of some 32 organisms x 704 generations = 22528 deaths. On average that's 22528/4 = 5632 deaths per bit. Note that theoretically one could get 1 bit of information with only 1 binary decision. So the evolution is, not surprisingly, a rather inefficient information generating mechanism. No biologist has ever claimed any differently!
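The arithmetic above can be checked in a few lines. This is only a sketch in Python (the ev program itself is written in Pascal); the variable names are illustrative, and the numbers are the ones quoted in the paragraph above.

```python
# Cost of the information gain in the standard ev example, using the
# figures quoted above. Variable names are illustrative, not from ev.
deaths_per_generation = 32   # half of the population is replaced each generation
generations = 704            # generations in the standard example
bits_gained = 4              # information gained by the binding sites

total_deaths = deaths_per_generation * generations
deaths_per_bit = total_deaths / bits_gained

print(total_deaths)    # 22528
print(deaths_per_bit)  # 5632.0
```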

Note that "from scratch" does not mean the same thing as "for free". "From scratch" refers (obviously) to the initial condition of the genome which is random in this case so that Rsequence = 0 bits. That is, there is no measurable information in the binding sites at the beginning of the simulation. "For free" would mean "without effort", and the paragraph above demonstrates that there is quite a bit of effort and (virtual) pain for the gains observed.
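To make "Rsequence = 0 bits" concrete, here is a minimal Python sketch of the uncorrected measure: for each position in a set of aligned sites, 2 bits minus the Shannon uncertainty of the bases observed there, summed over positions. The function name `rsequence` and the example sites are assumptions made for illustration. The small-sample correction used in the published analyses is deliberately left out, so a finite set of truly random sites would come out slightly above zero with this version.

```python
from collections import Counter
from math import log2

def rsequence(sites):
    """Uncorrected information content, in bits, of aligned DNA sites.

    Hypothetical helper: sums (2 - H) over positions, where H is the
    Shannon uncertainty of the base frequencies at that position.
    No small-sample correction is applied.
    """
    n = len(sites)
    total = 0.0
    for column in zip(*sites):          # one column per position in the site
        counts = Counter(column)
        uncertainty = -sum((c / n) * log2(c / n) for c in counts.values())
        total += 2.0 - uncertainty
    return total

# Every base equally frequent at every position: no information.
balanced = ["ACGT", "CGTA", "GTAC", "TACG"]
# Every site identical: 2 bits per position.
conserved = ["ACGT"] * 4

print(rsequence(balanced))   # 0.0
print(rsequence(conserved))  # 8.0
```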

NOTE ALSO that Dembski has attempted to put words into my head. I did not use the term "complex specified information" in my paper. This is Dembski's jargon, unique to him. No scientist or engineer uses it. Furthermore, the statement implies that I made the information. That's wrong; it was generated by the process of the program. A careful scholar would avoid incorrect attributions.
Schneider claims to have generated complex specified information for free. The No Free Lunch theorems, however, tell us this is not possible.
2003 Jan 1 (revised response). Dembski made a strong claim here. While I am not an expert on these theorems, I understand from other scientists that
1. The theorems were proven by Wolpert and Macready at the Santa Fe Institute. See: http://citeseer.nj.nec.com/wolpert96no.html. The reference is Wolpert, David H. and William G. Macready. 1997. The No Free Lunch Theorems For Optimization. IEEE Trans. Evol. Comp. v. 1, no. 1, 67-82. The theorems are about how well algorithms do relative to one another, not about whether an algorithm works.
2. The theorems were not tied to the evolutionary problem by Dembski. See: "William Dembski's treatment of the No Free Lunch theorems is written in jello", by David Wolpert.
3. The theorems do not contradict the ev model.
The No Free Lunch theorems are not relevant to the problem, so Dembski is using misdirection. Indeed this is obvious from inspection of the ev program and its results: it works as claimed. A careful worker would not make this mistake because they would take the time to understand the theorem before citing it.
Where, then, has he smuggled in complex specified information? The precise place where he smuggles it in is not hard to find if one knows what to look for. Here is the crucial paragraph in his article: "The organisms [i.e., the computational sequences in phase space] are subjected to rounds of selection and mutation. First, the number of mistakes made by each organism in the population is determined. Then the half of the population making the least mistakes is allowed to replicate by having their genomes replace ('kill') the ones making more mistakes. (To preserve diversity, no replacement takes place if they are equal.) At every generation, each organism is subjected to one random point mutation in which the original base is obtained one-quarter of the time."(8) Within this crucial paragraph, the crucial sentence is: "The number of mistakes made by each organism in the population is determined." Who or what determines the number of mistakes? Clearly, Schneider had to program any such determination of number of mistakes into his simulation.
This is not unreasonable because it happens the same way in nature. For example, if a bacterium has severe mutations in 5 ribosome binding sites, then that means that 5 proteins will not be made. Is this fatal? Not necessarily. Suppose that the 5 proteins process 5 different sugars. If the sugars are not in the medium that the bacterium swims in, then it will make no difference. But when the bacterium comes to a solution where one of the sugars is available, it will be unable to eat. If that is the sole carbon source, it will starve (surely a 'mistake'!) and bacteria that have mutations that correct a site or already have a correct site will survive. A simple way to account for this is to count the number of mistakes. It may be that a highly beaten up genome (with 100 mistakes) is pretty much as badly off as one with only a few mutations, so maybe one should take the logarithm of the number of mistakes. But a logarithm is a monotonically increasing function of its argument, so this will not change the selection order and therefore would not affect the evolution (other than wasting computer cycles). Surely it is not reasonable to say that a creature with 5 mistakes will survive better than one with 2, so to match the natural situation we should pick a monotonically increasing function. That's what I did in the paper.
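The point about monotonic functions can be checked directly. The following Python sketch uses made-up mistake counts and hypothetical organism names; it ranks the same organisms by raw mistake count and by a logarithm of that count, and confirms that the selection order is identical. `log1p` (that is, log(1 + m)) is used only because the logarithm of zero mistakes is undefined.

```python
from math import log1p

# Made-up mistake counts for five hypothetical organisms.
mistakes = {"bug_a": 5, "bug_b": 2, "bug_c": 100, "bug_d": 0, "bug_e": 7}

# Rank by the raw count, and by a monotonically increasing transform of it.
by_count = sorted(mistakes, key=lambda bug: mistakes[bug])
by_log = sorted(mistakes, key=lambda bug: log1p(mistakes[bug]))

print(by_count)            # ['bug_d', 'bug_b', 'bug_a', 'bug_e', 'bug_c']
print(by_count == by_log)  # True
```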

So the answer to "Who or what determines the number of mistakes?" is: Just as in nature, the number of genetic control systems that, if controlled, would give an advantage determines the number of mistakes.

In the ev program the number of binding sites is determined by the user, but this is irrelevant since it is a free parameter (i.e. the user may explore any value) and similar results are obtained with various numbers of binding sites. In nature the number of required binding sites is determined by the history and current physiology of an organism. This was discussed in both ev and schneider1986.

In addition to this, Wesley R. Elsberry points out that the environment is important. Specifically, the environment determines the opportunities for physiology. For example, if a bacterium is frequently in an environment that has a sugar that it cannot metabolize, then duplication of the metabolic genes from a similar sugar and drift of the protein recognition will lead to the introduction of another ribosome binding site and associated control systems.
Moreover, the determination of number of mistakes is the key defining feature of his fitness function, for which optimal fitness corresponds to minimal number of mistakes.
I generally do not find 'fitness' to be a useful concept. In the ev program there is no fitness function and the word 'fitness' does not appear in the paper. Unlike most biologists I dispense with the concept of a fixed 'fitness function'. A 'fitness landscape' is too rigid since it does not describe the effects the organism itself may have and it does not account for a changing environment (In addition, fitness is generally depicted as 2 dimensional, which causes severe conceptual problems, see ccmm). At best there is only 'relative instantaneous fitness' in a changing environment. That is, whoever makes the fewest mistakes in the current environment is likely to survive.
Schneider's choice of fitness function is the most obvious place where he smuggles in complex specified information.
Counting the number of mistakes matches what happens in nature, as described above. I only claim that the ev simulation matches what happens in nature in essential points. No smuggling occurs. If Dembski finds that this produces information, then he will understand that the simulation shows that information can be generated in nature solely by replication, mutation and selection. That is, information as mathematically defined by Claude Shannon can be generated by Darwinian evolution.
But there are others. In the _Nucleic Acids Research_ article we've been discussing, he does not list the source code for the program underlying his simulation. For that code he refers readers to the relevant web address. The source code is revealing and shows that Schneider had to do a lot of fine-tuning to his evolutionary algorithm to make his simulation come out right.
That is quite incorrect. After finishing the writing of my thesis in the spring of 1984 (see Schneider1986) I had a spare week before my thesis defense. I had already determined how to write the evolution program, and I decided to do it as a test of my thesis. I wrote the program in the space of 2 or 3 days. I did not fine-tune it and yet it gave the result that Rsequence approached Rfrequency. (Tweaking it to work would have invalidated my goal, which was to test my thesis!) At the end of the week I walked into my thesis defense knowing that I could simulate my main thesis using a computer! In any case, I never mentioned that during the defense ... If Dembski will look more closely at the code he will see that it is constructed quite cleanly with lots of documentation (55% of the code characters in ev are in comments). The main loop is:
```
   for c := 1 to e.cycles do begin
      culture(list,e);
      if (c mod e.storagefrequency) = 0 then putout(e, all);
   end;
```
where 'e' is a compound variable that contains all other variables ("everything") and putout just saves the current state every once in a while. The culture routine is:
```
   e.generation := e.generation + 1;
   reproduce(e);
   mutate(e);
   order(list, e)
```
Note that:
- 'reproduce' replicates the organisms with fewer mistakes into the places of the ones with more mistakes and does nothing else.
- 'mutate' makes changes randomly over the genome without regard to where the hits are and it does nothing else.
- 'order' translates the weight matrix and scans the genome of each organism to determine the number of mistakes. It then sorts the organisms by their number of mistakes, and does nothing else besides data recording into the list file.
There is no place to fit in special 'fine-tuning'!
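For readers who do not read Pascal, the three steps can be sketched in Python. This is not the ev code and makes several assumptions that are labeled in the comments: the function and parameter names are hypothetical, the pairing of each organism in the better half with the one half a population away is an assumption, and the mistake counter is a toy stand-in (ev instead translates a weight matrix and scans the genome). The sketch only shows that a loop of this shape has three plain steps and nothing else.

```python
import random

BASES = "ACGT"

def order(population, count_mistakes):
    """Evaluate every organism, then sort by number of mistakes (fewest first)."""
    scored = [(count_mistakes(genome), genome) for genome in population]
    scored.sort(key=lambda pair: pair[0])
    return scored

def reproduce(scored):
    """Copy each organism in the better half over its partner in the worse
    half, unless both have the same number of mistakes (the special rule).
    The i -> i + half pairing is an assumption of this sketch."""
    half = len(scored) // 2
    genomes = [genome for _, genome in scored]
    for i in range(half):
        if scored[i][0] < scored[i + half][0]:
            genomes[i + half] = genomes[i]
    return genomes

def mutate(population, rng):
    """One random point mutation per organism. The new base is drawn from all
    four, so the original base is obtained one-quarter of the time."""
    mutated = []
    for genome in population:
        pos = rng.randrange(len(genome))
        mutated.append(genome[:pos] + rng.choice(BASES) + genome[pos + 1:])
    return mutated

def evolve(count_mistakes, bugs=16, length=20, cycles=300, seed=1):
    """Run the reproduce / mutate / order cycle from a random start."""
    rng = random.Random(seed)
    population = ["".join(rng.choice(BASES) for _ in range(length))
                  for _ in range(bugs)]
    scored = order(population, count_mistakes)
    for _ in range(cycles):
        population = reproduce(scored)
        population = mutate(population, rng)
        scored = order(population, count_mistakes)
    return scored

# Toy stand-in for ev's evaluation (an assumption): each 'A' is one mistake.
final = evolve(lambda genome: genome.count("A"))
print("fewest mistakes:", final[0][0], "most mistakes:", final[-1][0])
```

Note that `mutate` never looks at the mistake counts and `reproduce` never looks at the genomes, which is the separation described in the list above.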
For instance, in the crucial paragraph from his article that I quoted above, Schneider remarks parenthetically: "To preserve diversity [of organisms], no replacement takes place if [the number of mistakes is] equal." Schneider's Pascal source code reveals why: "SPECIAL RULE: if the bugs have the same number of mistakes, reproduction (by replacement) does not take place. This ensures that the quicksort algorithm does not affect who takes over the population. [1988 October 26] Without this, the population quickly is taken over and evolution is extremely slow!"(9) Schneider is here fine-tuning his evolutionary algorithm to obtain the results he wants. All such fine-tuning amounts to investigator interference smuggling in complex specified information.
This is a testable claim, and the test and results are given on another page. The results show clearly that removing the SPECIAL RULE has no effect on the gain of information by evolution, so William A. Dembski's claim is incorrect.

2001 June 7. Wesley R. Elsberry has a very extensive page on William A. Dembski. (2002 Jun 20. The old page moved.)

Rebuttal to "No Free Lunch"

2002 January 23. Dembski has published a book:

No Free Lunch: Why Specified Complexity Cannot Be Purchased without Intelligence, By William A. Dembski Rowman & Littlefield Publishers, Inc., $35.00, Cloth 0-7425-1297-5, November 2001, 432pp, amazon.com

In Section 4.9, "Following the Information Trail", on pages 212-218, Dembski attempts to deal not only with the Ev program, but also with my rebuttal on this page.

Dembski Responded to This Rebuttal. Dembski responded to this rebuttal (which you can read starting at the top of this page) by quoting parts of it (on the bottom of page 214 he quotes from this page), so he had an opportunity before the book went to press to correct any mistakes, to resolve any misunderstandings and to admit any errors. Much of the phrasing in the book is identical to his original statements. Finally, the rebuttal was started on 2001 June 7 and substantially finished on June 8, while the claim test was written 2001 June 6 and June 7. Dembski's recorded access to the web page in No Free Lunch (see page 235 item 71) was "10 June 2001", so HE HAD ACCESS.
Dembski Misunderstood the No Free Lunch Theorems. As noted above, Dembski misunderstood the No Free Lunch theorems. Surprisingly, he has not corrected himself. (But then, the entire book would probably evaporate, so he couldn't do that, could he?) On page 212 he says: "The No Free Lunch theorems show that evolutionary algorithms, apart from careful fine-tuning by a programmer, are no better than blind search and thus no better than pure chance." This is clearly wrong, given the results of the Ev program, since the program would have taken much longer to give results if it were merely randomizing without selection. Furthermore, the results would not have been stable (the information curve goes up and stays there). When one uses a theorem, one must understand it correctly and one must apply it correctly. If one does not do this, everything that follows may be nonsense, and one is operating below Level 4 of Dudley's Criteria.
CSI is ill defined. Page 215: "The issue is whether in the currency of information, and CSI in particular, Darwinian evolution incurs a cost." CSI is not used by anyone except Dembski and as far as I know is undefined. A precise definition is needed before Dembski can claim anything. I will assume that he means Shannon information as defined in my paper. There is no reason to add extra notation, so saying that the information is "complex specified" does not add anything. I have not had a chance to read all of "No Free Lunch", but to be taken seriously, Dembski must do one of the following:
- Publicly abandon the term "Complex Specified Information".
  OR
- Publish a precise definition along with a clear explanation of how CSI differs from standard measures already in the literature. Such a definition would show exactly how to compute CSI given (for example) DNA sequences of protein binding sites.
If he does not do one of these things, I see no reason why anyone should not dismiss his works out-of-hand from now on. I think that he should abandon "CSI".
- 2002 March 9: see Dissecting Dembski's "Complex Specified Information".
- 2003 November 12: Marcel Popescu (mdpopescu@yahoo.com, http://www.artelecom.net/mdpopescu/) responded to this point in an undated diatribe (in the sense of abusive):
  As for Schneider, he lies very blatantly on his page, in his "CSI is ill-defined" paragraph. CSI is rigurously defined by Dembski - I don't have the book with me now, but I will bring it tomorrow, and give you the page number for those interested in checking it. As I said above, CSI is specified information beyond the universal probability bound - that is, specified information more than 500 bits long.
  There is a huge difference between lying and recognizing a bad definition. There are still problems with CSI:
  - The number of bits cited, 500, is irrelevant because the computation is for a random string (as described by Popescu) instead of a selected string. This is the classic error of creationists, repeated ad nauseam: it is not appropriate to multiply the probabilities when the events are not independent. So this number of bits is irrelevant to the problem of evolution.
  - No clear explanation of how this supposed measure differs from Shannon has been given, as I requested above.
  So the problem lies both with the 500 bits - which is irrelevant to the problem at hand - and also more fundamentally in the meaning of CSI. It is not intrinsically different from "information" except for an arbitrary and irrelevant bound (500 bits).
Dembski Makes Incorrect Statements About Other People's Positions. Continuing on page 215: "It does not, and Schneider agrees that it does not." Dembski put words into my mouth. (This is a common creationist tactic.) There is an extreme cost to gain the information in living things - the death of thousands of organisms. In other words, selection is highly costly. BUT at the same time, selection is the process by which the remaining organisms have more information than the starting population. I stated this clearly higher up on this page but instead of dealing with it, Dembski dismissed it as "semantic hair-splitting".
Dembski's Tautology Objection Fails. Page 216: Dembski points out my statement (from higher up on this page) that "At best there is only 'relative instantaneous fitness' in a changing environment. That is, whoever makes the fewest mistakes in the current environment is likely to survive." He then claims that "This last statement is a tautology. It says that the survivors are the fittest (according to some apparently inexpressible notion of "relative fitness") and that the fittest are the survivors." First, it is not inexpressible. It is clear that in the Ev program an organism from late in the evolution would invariably win if placed into an early population. So what counts is who makes more mistakes (a relative comparison), not an absolute number of mistakes. I am willing to accept a concept of fitness, as stated above, which is changing over time. The point is that fitness is not a simple function of the environment because the organism changes the environment. So hill climbing on a fixed landscape is not a good model of what happens in nature. Second, the interpretation that there is a tautology is wrong. There is no 'fittest', only a number of mistakes made by an organism in the program. This is simple to compute and the program does that. In the program FIRST the number of mistakes made by each organism is determined. THEN the organisms are compared in a separate step. The ones that survive are those with the fewest mistakes. In the code of the program is a procedure called 'order'. It contains:
```
procedure order(var list: text; var e: everything);
(* order the bugs: evaluate, sort and display *)
var
   bug: position; (* index to the rank array *)
begin
   if e.selecting
   then for bug := 1 to e.p.bugs do evaluate(e,bug)
   else for bug := 1 to e.p.bugs do randomize(e,bug);
   quicksort(1,e.p.bugs);
   if (e.generation mod e.displayinterval = 0)
   then display(list, e);
end;
(* end module ev.order version = 2.50; (@ of ev 1988 oct 6 *)
```
Thus if there is selecting, then the bugs are evaluated. This is followed by a call to quicksort to sort them by the number of mistakes. (The actual killing is done in procedure reproduce, where the bugs in the first half are duplicated and copied into the 'stalls' of the bugs in the second half. Of course, this is only a selection if the bugs were sorted by their mistakes.) There are two distinct steps and there is no tautology. Note that the first step can be a random evaluation if the user chooses to turn selection off. This objection is merely a diversion by Dembski who is evidently grasping at straws (and making straw men!) to avoid concluding that the program really does evolve binding sites exactly in parallel with what happens in nature.
Environment makes selections. Page 216: Dembski asks who or what counts the mistakes. The answer is simple: the environment in interaction with the organism. For example, in front of an open reading frame there must be a ribosome binding site or the open reading frame will not be made into a protein. This was clearly stated in the original Ev paper. Apparently Dembski does not understand the meaning of R_frequency. Briefly, there are a certain number of genes which, if under appropriate control, would give a selective advantage to an organism. This number is approximately fixed by history and current genetics. If an organism has the controls, it will do better than one that does not have controls. The latter is more likely to die. This probabilistic situation occurs in the ev program. So the answer is that 'the environment' counts the mistakes. For example,
- if a researcher invents new terminology without giving reasonable arguments for it,
- makes blatant mistakes in his arguments,
- if he does not correct errors when they are pointed out to him,
- and especially if he avoids dealing with tests of his claims,
then in the long run he will be dismissed by the majority of scientists. The environment makes the selection.
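As a numerical aside on the R_frequency mentioned above: it is the information needed to pick out the sites from all the places in the genome where a site could be, log2(G/gamma) for a genome of G positions containing gamma sites. The Python sketch below is illustrative only. The function name is hypothetical, and the values G = 256 and gamma = 16 are assumed here for the standard ev run (they are not stated on this page), chosen because they give the 4 bits quoted earlier.

```python
from math import log2

def rfrequency(genome_size, number_of_sites):
    """Bits needed to locate the sites among all positions in the genome.

    Hypothetical helper implementing log2(G / gamma).
    """
    return log2(genome_size / number_of_sites)

# Assumed values for the standard ev run: 256 positions, 16 sites.
print(rfrequency(256, 16))  # 4.0
```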
Two Parts. Page 217: "The second part of Schneider's response is therefore to admit that the counting of mistakes does occur after all ..." This is a very curious statement, given that the initial idea in the Ev program was to count mistakes! Dembski has created a "strawman" image of my arguments and then proceeds to (attempt to) knock them down. Again, the only way this will succeed is if readers who have not taken time to read my original paper are fooled. Using the ignorance of the reader is a typical creationist ploy. The 'two part' argument that Dembski complains about only reflects his own misunderstanding.
Chock-full. Page 217: "Yet if the counting of mistakes matches what happens in nature in essential points, then the obvious conclusion is that nature is chock-full of design and that replication, mutation, and selection are merely instruments for expressing that design." This is another misunderstanding by Dembski. (They come so fast and thick here that I am only pointing out the highlights. The reader can find plenty more!) The three processes of replication, mutation, and selection are not expressing 'design' in nature. A cliff is not a 'design' in the sense of being formed by an intelligent being (as Dembski would have us think); it forms by well understood geological processes. Yet the cliff will select against animals that cannot see and for those with better eyesight. It is more accurate to say that the evolutionary processes reflect the environment.
TEST OF CLAIMS IGNORED! The statement directly above, on page 217, is a lead-in to Dembski's objection to the SPECIAL RULE. INCREDIBLY Dembski failed to respond to my test of his claim, the results of which completely demolished his argument! On page 217 of the book he ran the same argument. Perhaps he didn't have time to get it into the book? I initiated this web page on 2001 June 6. The book just came out and the publication date was November 2001. Was there enough time? Yes. Dembski quoted other items on this web page, so he had full access to the arguments. There is no reason to avoid the critical issue. Dembski avoided discussing a careful study which showed that his arguments are false. This tactic will only work if the reader is unaware that Dembski is avoiding arguments that demolish his position.

Summary of No Free Lunch. I have reviewed a small portion of No Free Lunch here, but it was full of not only errors but also many poor arguments. At amazon.com an unnamed (!) reviewer "A reader from New Mexico" asked that the scientific community respond to this book. This is a response. Not surprisingly, Dembski's arguments fail time and time again.

Acknowledgments

Thanks to Ilya Lyakhov and Wesley R. Elsberry for many useful comments.

2002 March 10. I completed reading No Free Lunch. Dembski attempts to define complex specified information. His arguments are messy (repeatedly varying from silly examples to overdone math) but the essentials seem clear enough to deal with. I therefore asked whether the Ev program creates CSI and was led to conclude that it does. (This in no way implies that I condone CSI. I think it is a mistake and that Shannon was a master to avoid this mistake.) The logic is given on Dissecting Dembski's "Complex Specified Information".

2002 April 25. Not a Free Lunch But a Box of Chocolates: A critique of William Dembski's book No Free Lunch by Richard Wein.

"The standard of scholarship is abysmally low, and the book is best regarded as pseudoscientific rhetoric aimed at an unwary public which may mistake Dembski's mathematical mumbo jumbo for academic erudition."

2005 June 16. Critique of "Irreducible Complexity Revisited"

2011 Feb 16. Dissection of "A Vivisection of the ev Computer Organism: Identifying Sources of Active Information"


Schneider Lab
origin: 2001 June 6
updated: version = 2.25 of dembskirebuttal.html 2016 Feb 29