The results, which show
the successful simulation of binding site evolution,
can be used to
address both scientific and pedagogical issues.
$R_{sequence}$ approaches and remains around
$R_{frequency}$ (Fig. 2b),
supporting the hypothesis that the information content
at binding sites will evolve to be close to
the information needed to locate those binding sites in the genome,
as observed in natural systems [4,6].
That is,
one can measure information in genetic systems,
the amount observed can be predicted, and the
amount measured evolves to the amount predicted.
This is useful because,
when this prediction is not
met [4,28,29,6], the anomaly implies the existence of new
biological phenomena.
Simulations to model such anomalies have not been attempted yet.
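As a minimal sketch of the predicted quantity, the information needed to locate the sites can be computed from the genome size and the number of sites. The sketch assumes the usual definition $R_{frequency} = \log_2(G/\gamma)$, where $G$ is the number of possible binding positions and $\gamma$ is the number of sites. The function name and the example value of 256 positions are illustrative assumptions, chosen so that 16 sites give 4 bits per site.

```python
import math


def r_frequency(genome_positions: int, num_sites: int) -> float:
    """Bits needed to locate num_sites sites among genome_positions positions.

    Assumes the definition R_frequency = log2(G / gamma).
    """
    if num_sites <= 0 or genome_positions < num_sites:
        raise ValueError("need 0 < num_sites <= genome_positions")
    return math.log2(genome_positions / num_sites)


if __name__ == "__main__":
    # Illustrative assumption: 16 sites among 256 possible positions.
    print(r_frequency(256, 16))
```

Comparing this value with the information measured at the sites is what identifies an anomaly of the kind described above.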
Variations of the program could be used to investigate how
population size,
genome length,
number of sites,
size of recognition regions,
mutation rate,
selective pressure,
overlapping sites
and other factors
affect the evolution.
Another use of the program may include
understanding the sources and effects of skewed genomic composition
[4,7,30,31].
However,
the skew could be caused by mutation rates,
and/or it could be the result of some kind(s)
of evolutionary pressure that we don't understand,
so
how one implements the skew in a simulation may well affect
or bias the results.
The ev
model quantitatively addresses the question
of how life gains information,
a valid issue recently
raised by creationists
[32]
(Truman, R. (1999),
http://www.trueorigin.org/dawkinfo.htm)
but only qualitatively addressed by biologists
[33].
The mathematical form of
uncertainty and
entropy
($H = -\sum p \log_2 p$, $\sum p = 1$)
implies that
neither
can be negative ($H \ge 0$),
but a decrease in uncertainty or entropy
can correspond to information gain, as measured here by
$R_{sequence}$ and
$R_{frequency}$.
The ev model shows explicitly how this information gain
comes about from mutation and selection,
without any other external influence,
thereby completely answering the creationists.
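The relation between a decrease in uncertainty and a gain in information can be illustrated with a short sketch. It computes the Shannon uncertainty of each column of a set of aligned sites and sums $2 - H$ over the columns as an estimate of $R_{sequence}$. This is a simplified illustration, not the ev program itself: the function names and example sequences are hypothetical, and the small-sample correction normally applied to such estimates is omitted.

```python
import math
from collections import Counter


def uncertainty(column: str) -> float:
    """Shannon uncertainty H = -sum(p * log2(p)) of one alignment column, in bits."""
    total = len(column)
    counts = Counter(column)
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def r_sequence(sites: list[str]) -> float:
    """Sum over positions of (2 - H) for aligned DNA sites of equal length.

    Simplification: no small-sample correction is applied.
    """
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    columns = ("".join(site[i] for site in sites) for i in range(length))
    return sum(2.0 - uncertainty(col) for col in columns)


if __name__ == "__main__":
    # Hypothetical sites: every base equally frequent at every position.
    unselected = ["ACGT", "CGTA", "GTAC", "TACG"]
    # Hypothetical sites after selection: fully conserved.
    conserved = ["ACGT", "ACGT", "ACGT", "ACGT"]
    print(r_sequence(unselected), r_sequence(conserved))
```

In this toy case the uncertainty at each position falls from 2 bits to 0 bits, so the information rises from 0 to 8 bits while $H$ itself never becomes negative.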
The ev model can also be used to succinctly address two
other creationist arguments.
First,
the recognizer gene and its binding sites co-evolve,
so they become dependent on each other
and
destructive mutations in either immediately lead to elimination of the
organism.
This situation fits Behe's [34]
definition of `irreducible complexity' exactly
(``a single system composed of several
well-matched, interacting parts that contribute
to the basic function, wherein the removal
of any one of the parts causes the system to effectively cease
functioning'', page 39),
yet
the molecular evolution
of this `Roman arch'
is straightforward and rapid,
in direct contradiction to his thesis.
Second,
the probability of finding 16 sites averaging 4 bits each in random sequences
is
$2^{-4 \times 16} \approx 5 \times 10^{-20}$,
yet the sites evolved
from random sequences
in only $10^3$ generations,
at an average rate of 1 bit per 11 generations.
Because the mutation rate of HIV is only 10 times slower,
it could evolve a 4 bit site in 100 generations,
about 9 months [35],
but it could be much faster because the enormous
titer ($10^{10}$ new virions/day/person [17])
provides a larger pool for successful changes.
Likewise, at this rate, roughly an entire human genome
of
bits
(assuming an average of 1 bit/base, which is clearly an overestimate)
could evolve in a billion years, even without the advantages
of large environmentally diverse worldwide populations,
sexual recombination and interspecies genetic transfer.
However, since this rate is unlikely to be maintained for
eukaryotes, these factors are undoubtedly important in accounting
for human evolution.
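The arithmetic behind the second argument can be checked with a short sketch. It assumes that the chance of finding the sites in random sequence is $2^{-b}$ for $b$ total bits, and that the quoted average rate of 1 bit per 11 generations applies uniformly. The function names are illustrative.

```python
def chance_probability(num_sites: int, bits_per_site: int) -> float:
    """Probability 2**(-total bits) of finding all sites by chance (assumed model)."""
    return 2.0 ** -(num_sites * bits_per_site)


def generations_needed(total_bits: int, generations_per_bit: int) -> int:
    """Generations required at a constant average rate of information gain."""
    return total_bits * generations_per_bit


if __name__ == "__main__":
    total_bits = 16 * 4  # 16 sites averaging 4 bits each
    print(total_bits)
    print(chance_probability(16, 4))           # roughly 5.4e-20
    print(generations_needed(total_bits, 11))  # 704
```

Under these assumptions, 64 bits at 11 generations per bit corresponds to about 700 generations, in contrast with the vanishingly small probability of the same sites appearing by chance alone.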
So,
contrary to probabilistic arguments by Spetner [36,32],
the ev program
also clearly demonstrates that biological information,
measured in the strict Shannon sense,
can rapidly appear in genetic control systems subjected
to replication, mutation and selection [33].