-
Blue or cyan boxes mark a gene.
Why do you think we chose blue?
It's a terrible pun!
-
If a creature makes fewer mistakes than all the others, will it survive?
Yes, because it will be sorted into the half that is not killed.
-
How many bacteria are on a normal human?
It has been calculated that the normal human houses about 10^12
bacteria on the skin, 10^10 in the mouth, and 10^14
in the gastrointestinal tract. The latter number is far in excess of the
number of eukaryotic cells in all organs which comprise the human host.
---
The Bacterial Flora of Humans
by Kenneth Todar, 2002.
(Note: The link was broken as of 2011 Aug 17.
If the above link is still broken, you can search for it:
The Bacterial Flora of Humans by Kenneth Todar, 2002)
-
How quickly would a creature that makes one fewer mistake than the
others take over a population of 16 creatures?
Assuming that the creature continues to make fewer errors
than the others (that is, assuming it is not wrecked by
mutations):
- In the first generation it will be in the top 8.
- In the second generation it will be in the top 4.
- In the third generation it will be in the top 2.
- In the fourth generation it will be the top creature.
The number of times you need to cut a number in half to get down
to one is called the logarithm to the base 2 of the number.
In this case, log_2(16) = 4.
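The halving argument above can be checked with a short Python sketch. The function name is made up for illustration, and the sketch assumes the population size is a power of two, as in the example of 16 creatures; it is not taken from Ev's source code.

```python
import math


def halvings_to_one(population_size: int) -> int:
    """Count how many times a number must be cut in half to reach 1."""
    if population_size < 1:
        raise ValueError("population_size must be at least 1")
    generations = 0
    while population_size > 1:
        population_size //= 2  # the worse half is removed each generation
        generations += 1
    return generations


print(halvings_to_one(16))  # 4
print(math.log2(16))        # 4.0, the same answer from the logarithm
```

For sizes that are not powers of two, the integer division rounds down, so the count is only an approximation of the logarithm.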
-
Where do the numbers in the weight matrix gene come from?
The numbers are a 'translation' of the piece of the genome
just above the colored box.
In the standard example, the weight 2T
(covering coordinates 56 to 60) has a value of +510.
This number comes from the sequence CTTTG.
How is the sequence translated into the number?
The first step is to make rules for converting the DNA
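sequence into a binary string. The rules used in Ev are:
A = 00 | C = 01 | G = 10 | T = 11
So the sequence is translated like this:
sequence: | C  | T  | T  | T  | G  |
binary:   | 01 | 11 | 11 | 11 | 10 |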
We treat this like a binary number, so:
place value: | "sign" | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
binary:      |   0    |  1  |  1  | 1  | 1  | 1  | 1 | 1 | 1 | 0 |
To express this binary number as a decimal number,
we add the place values that correspond to the binary ones:
256 + 128 + 64 + 32 + 16 + 8 + 4 + 2 = +510.
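The translation of a positive weight can be sketched in Python. The base-to-bits table (A = 00, C = 01, G = 10, T = 11) is inferred from the worked examples in this answer, and the function names are hypothetical; this is an illustration, not Ev's actual code.

```python
# Two bits per base, inferred from the worked examples in this answer.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}


def sequence_to_bits(sequence: str) -> str:
    """Translate a DNA sequence into a binary string, two bits per base."""
    return "".join(BASE_TO_BITS[base] for base in sequence.upper())


def place_value_sum(bits: str) -> int:
    """Add the place values under each 1, skipping the leading sign bit."""
    magnitude_bits = bits[1:]
    total = 0
    for position, bit in enumerate(reversed(magnitude_bits)):
        if bit == "1":
            total += 2 ** position
    return total


bits = sequence_to_bits("CTTTG")
print(bits)                   # 0111111110
print(place_value_sum(bits))  # 510
```

This sketch only covers the case where the sign bit is 0; negative weights need the two's complement step.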
Negative numbers are treated a little differently.
Notice that the highest bit of the binary string is
the "sign". It is possible to let this bit alone carry the sign
(0 for positive, 1 for negative) and read the remaining bits
as the size of the number, but that's not how it is done
in Ev. Instead a method called two's complement notation is used.
To find the negative of a number in two's complement notation,
complement all the bits (i.e. switch 0 to 1 and 1 to 0) and add 1.
For example, in the standard example, the weight 0A
(at positions 1 to 5) has the sequence TCGAC:
sequence:   | T  | C  | G  | A  | C  |
binary:     | 11 | 01 | 10 | 00 | 01 |
complement: | 00 | 10 | 01 | 11 | 10 |
Reading the complement as a binary number:
place value: | "sign" | 256 | 128 | 64 | 32 | 16 | 8 | 4 | 2 | 1 |
binary:      |   0    |  0  |  1  | 0  | 0  | 1  | 1 | 1 | 1 | 0 |
128 + 16 + 8 + 4 + 2 = 158.
Now add 1 and we get 159.
The number in the weight matrix
at 0A is the negative of this, -159.
So, to summarize the answer to the question,
the numbers in the weight matrix are encoded in the
genetic sequence using two's complement notation.
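The complement-and-add-1 procedure can be written as a small Python sketch. The function name is hypothetical and the input is assumed to be a string of 0s and 1s such as the 10-bit strings in the examples above; Ev's own implementation may differ.

```python
def twos_complement_value(bits: str) -> int:
    """Decode a binary string written in two's complement notation."""
    if not bits or any(bit not in "01" for bit in bits):
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    if bits[0] == "0":
        # Sign bit is 0: the string is an ordinary non-negative binary number.
        return int(bits, 2)
    # Sign bit is 1: complement every bit, add 1, and negate the result.
    complemented = "".join("1" if bit == "0" else "0" for bit in bits)
    return -(int(complemented, 2) + 1)


print(twos_complement_value("1101100001"))  # -159 (weight 0A, sequence TCGAC)
print(twos_complement_value("0111111110"))  # 510  (weight 2T, sequence CTTTG)
```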
-
In the screenshot, how many sites have a
T at coordinate +2 (the third base)?
12 out of 16.
-
How do I set a number?
In the control panel are the words 'Cycles to run',
with a box to the right that contains a number:
To the right of the number are two triangles.
If you click on the upper one the number will increase;
with the lower one the number will decrease.
The increments are 10,000.
If you want to set the number directly,
click in the white area that contains the number.
This will make a vertical line inside the box:
Use your keyboard arrow keys, or click a second time
inside the number, to move the line:
Type a 0 and you should have made the number bigger:
Now the number of generations has been set much higher.
You can also use your delete key(s) to remove digits
to the left (or right on a Mac) of the vertical line.
Finally,
you can hold the mouse button down and
sweep your mouse over the entire number
to highlight it. Then you can
type a new one in its place.
-
Find a strong
(Cyan colored)
weight while the simulation is running
and watch the corresponding positions in the binding sites.
What happens?
Usually the position in the binding site has the
same base, but not always. It can be lost by mutation,
but after a while it is generally regained
(unless the weight matrix changed).
Also, there can be two or more weights that are strong
for a given position in the site.
The letters flicker over time, but because the sites
are under selection, some contacts are maintained over time.
Positions that do not change much in time are
highly conserved.
However,
if you watch closely for a long time,
you will see that the overall pattern drifts.
The sequence conservation can be measured in bits
of information.
-
Pick a position on the genome that is not part of the gene and is not
in one of the binding sites. These are places that are not above any
colored rectangle. What do you see when you run the evolution quickly?
Why?
The position should change very quickly, flickering
between all four bases.
This position is not contributing information to the binding sites,
so it is not conserved over time.
The effect is called 'neutral drift'
(search the web for 'neutral drift').
-
While the simulation is running at top speed,
position your mouse over the Selection button but don't
click it yet.
Now find a well-conserved binding site base in the genome
(i.e. one that has a high corresponding weight matrix
and which is therefore stable).
Watch this position while you click your mouse to turn off selection.
What happens?
Positions that are (reasonably!) stable under selection
will start changing quickly when there is no selection.
That is, the sequence conservation
is lost and the binding sites
atrophy.
-
Having watched the decay of sequence conservation,
what happens if you turn
selection on again? Is the base still conserved?
Why or why not? Can you control this?
If you let them decay a long time, then when
the organisms are selected again, a new binding site
pattern will be established.
So the base that was conserved before might not
be in the new generations.
If you let them decay only a short time, you may
still retain the same pattern in the binding sites.
-
How many t's are at position 2 in the sites?
There are 12 t's at position 2, along with
2 c's, 1 a, and 1 g,
for a total of 16 bases.
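Counts like these are what a sequence logo summarizes, and they can be turned into a conservation value in bits. The Python sketch below uses the common formula of 2 bits minus the Shannon uncertainty of the observed base frequencies. The function name is made up for illustration, and the sketch deliberately leaves out any correction for the small sample of 16 sites, so the value displayed by the program may differ somewhat.

```python
import math


def information_bits(counts: dict) -> float:
    """Conservation at one position: 2 bits minus the Shannon uncertainty."""
    total = sum(counts.values())
    uncertainty = 0.0
    for count in counts.values():
        if count > 0:
            frequency = count / total
            uncertainty -= frequency * math.log2(frequency)
    return 2.0 - uncertainty


counts = {"t": 12, "c": 2, "a": 1, "g": 1}  # the counts given in this answer
print(round(information_bits(counts), 2))  # 0.81
```

A completely conserved position gives 2 bits, and a position with all four bases equally common gives 0 bits.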
-
Watch the last base of the site in both the sequence logo
and the genome.
What happens?
Over the first 1000 generations the last base, at position 5,
becomes a completely conserved A.
You can see this in all the binding sites.
The sequence logo shows a tall A at that position.
-
Once a sequence logo has emerged, turn off selection.
What happens to the logo?
The sequence logo collapses.
-
What happens to the logo when you turn selection on again?
The sequence logo reappears but often with a different pattern
than before.
If you give only a short time for the pattern to atrophy,
then the newly established pattern may be similar to
the original one.
If you wait too long, the new pattern will usually
be very different.
-
How is the speed determined?
The speed control introduces a delay of approximately
1000 / 2^(speed-1) milliseconds.
The delay ranges from approximately 1 second for a value of 1, 1/2 second
for a value of 2, ..., 1 millisecond for a value of 11, ..., down to 1
microsecond for a value of 21.
A microsecond delay adds up to only 1 second over 1,000,000 generations, so
it is essentially zero delay, or full speed.
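The delay formula can be tabulated with a short Python sketch. The function name is hypothetical, and the formula is the approximate one stated above, so the real program's timing may not match exactly.

```python
def delay_milliseconds(speed: int) -> float:
    """Approximate delay per generation: 1000 / 2^(speed - 1) milliseconds."""
    return 1000.0 / 2 ** (speed - 1)


for speed in (1, 2, 11, 21):
    print(f"speed {speed:2d}: {delay_milliseconds(speed):.6f} ms")
```

Each step up in speed halves the delay, which is why speed 11 is close to 1 millisecond (1000 / 1024) and speed 21 is close to 1 microsecond (1000 / 1,048,576 milliseconds).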