Precision in Biology

Information theory was first described by Claude Shannon in 1948 [5]. It sets out a mathematical way to measure the choices made in a system. Although Shannon concentrated on communications, the mathematics applies equally well to other fields [6]. In particular, all of the theorems apply in biology because the same constraints occur in biology as in communication. For example, if I call you on the phone and the connection is bad, I may say "let me call you back" and hang up. I may even complain to the phone company, which then rips out the bad wires. Killing the phone line in this way is equivalent to selecting against a specific phenotype in biology.

A second example is the copying of a key. In biology copying is called "replication", and sometimes there are "mutations". We go to a hardware store and have a key copied, but when we get home we find that it doesn't fit the door. When we return to the person who copied it, they throw the key away (kill it) and start fresh.

This kind of selection does not occur in straight physics. It turns out that the requirement of being able to make distinct selections is critical to Shannon's channel capacity theorem [7]. Shannon defined the channel capacity, C (bits per second), as the maximum rate at which information can be sent through a communications channel in the presence of thermal noise. The theorem has two parts. The first part says that if the data rate one would like to send at, R, is greater than C, one will fail: at most C bits per second will get through. The second part is surprising. It says that as long as R is less than or equal to C, the error rate may be made as low as one desires. The way that Shannon envisioned attaining this result was by encoding the message before transmission and decoding it afterwards. Encoding methods have been explored in the ensuing 50 years [8,9], and their successful application is responsible for the accuracy of our solar-system-spanning communications systems.
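For the band-limited channel with Gaussian noise that Shannon analyzed, the capacity has the well-known closed form C = W log2(1 + P/N), where W is the bandwidth and P/N is the signal-to-noise power ratio. The following is only a minimal sketch of that formula; the function name and the example channel are illustrative assumptions, not taken from the text.

```python
import math

def channel_capacity(bandwidth_hz, signal_power, noise_power):
    """Shannon capacity C = W * log2(1 + P/N), in bits per second."""
    if bandwidth_hz <= 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and noise power must be positive, signal power non-negative")
    return bandwidth_hz * math.log2(1 + signal_power / noise_power)

# Hypothetical channel: 3000 Hz of bandwidth, signal power 15 times the noise power.
capacity = channel_capacity(3000.0, 15.0, 1.0)
print(f"C = {capacity:.0f} bits per second")
```

Any rate R above this value fails; any rate at or below it can, in principle, be made as reliable as desired by suitable coding.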

To construct the channel capacity theorem, Shannon assigned each message to a point in a high-dimensional space. Suppose that we have a voltmeter that can be connected by a cable to a battery with a switch. The switch has two states, on and off, and so we can send 1 bit of information. In geometrical terms, we can record the state (voltage) as one of two points on a line, such as X=0 and X=1. Suppose now that we send two pulses, X and Y. This allows for 4 possibilities, 00, 01, 10 and 11, and these form a square on a plane. If we send 100 pulses, then any particular sequence will be a point in a 100-dimensional space (hyperspace).
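The geometric picture can be made concrete with a small sketch that lists every possible message as a point whose coordinates are the pulse values. The function name `message_points` is a hypothetical helper introduced here for illustration.

```python
from itertools import product

def message_points(n_pulses):
    """Return every binary message of n_pulses pulses as a point (a corner of a hypercube)."""
    return list(product((0, 1), repeat=n_pulses))

one_pulse = message_points(1)    # two points on a line
two_pulses = message_points(2)   # four corners of a square
print(one_pulse)
print(two_pulses)

# For 100 pulses the points are far too numerous to list, but they are easy to count.
n_messages_100 = 2 ** 100
print(n_messages_100)
```

Each added pulse adds one dimension and doubles the number of corners, which is why 100 pulses already require a 100-dimensional space.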

If I send you a message, I first encode it as a string of 1s and 0s and then send it down the wire. But the wire is hot and this disturbs the signal [10,11]. So instead of X volts you would receive $X + \Delta X$, a variation around X. There would be a different variation for Y: $Y + \Delta Y$. $\Delta X$ and $\Delta Y$ are independent because thermal noise does not correlate over time. Because they are the sum of many random molecular impacts, for 100 pulses the $\Delta X$ values would have a Gaussian distribution if they were plotted on one axis. But because they are independent, and the geometrical representation of independence is a right angle, this represents 100 different directions in the high-dimensional space. There is no particular direction in the high-dimensional space that is favored by the noise, so the original message will arrive at the receiver somewhere on a sphere around the original point [7,12,3].
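A small simulation can illustrate why the received point lands on a sphere. The sketch below adds independent Gaussian noise to each of 100 pulses and measures how far the received point lies from the one sent. The noise level `sigma`, the number of trials, and the function name are assumptions chosen only for illustration.

```python
import math
import random

def received_distance(n_pulses, sigma, rng):
    """Distance between the sent point and the received point when each
    pulse picks up independent Gaussian noise of standard deviation sigma."""
    squared = sum(rng.gauss(0.0, sigma) ** 2 for _ in range(n_pulses))
    return math.sqrt(squared)

rng = random.Random(0)   # fixed seed so the run is repeatable
sigma = 0.1              # assumed noise level, in volts
n_pulses = 100

distances = [received_distance(n_pulses, sigma, rng) for _ in range(2000)]
mean_distance = sum(distances) / len(distances)
sphere_radius = sigma * math.sqrt(n_pulses)   # radius expected for many dimensions

print(f"expected radius: {sphere_radius:.3f}")
print(f"mean distance:   {mean_distance:.3f}")
print(f"range observed:  {min(distances):.3f} to {max(distances):.3f}")
```

The distances cluster near `sigma * sqrt(n_pulses)` rather than spreading from zero outward. For a large number of pulses D, the relative thickness of this shell shrinks roughly as 1/sqrt(2D), so the sphere becomes sharper as the dimension grows.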

What Shannon recognized is that these little noise spheres have very sharply defined edges. This is an effect of the high dimensionality: in traversing from the center of the sphere to the surface there are so many ways to go that essentially everything is on the surface [13,14,12]. If one packs the message spheres together so that they don't touch (with some error because they are still somewhat fuzzy) then one can get the channel capacity. The positions in hyperspace that we choose for the messages are the encoding. If we were to allow the spheres to intersect (by encoding in a poor way) then the receiver wouldn't be able to distinguish overlapping messages. The crucial point is that we must choose non-overlapping spheres. This only matters in human and animal communications systems where failure can mean death. It does not happen to rocks on the moon because there is no consequence for "failure" in that case. So Shannon's channel capacity theorem only applies when there is a living creature associated with the system. From this I conclude that Shannon is a biologist and that his theorem is about biology.
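The claim that essentially everything is on the surface can be checked with simple geometry. The volume of a D-dimensional ball scales as the radius to the power D, so the fraction of its volume lying within the outer slice of relative thickness ε is 1 - (1 - ε)^D. The sketch below evaluates this for a few dimensions. It treats a uniformly filled ball, which is a simplification of the Gaussian noise cloud, and the function name is hypothetical.

```python
def outer_shell_fraction(dimensions, relative_thickness):
    """Fraction of a ball's volume within the outer shell of the given
    relative thickness, using volume proportional to radius**dimensions."""
    if dimensions < 1 or not 0.0 <= relative_thickness <= 1.0:
        raise ValueError("need dimensions >= 1 and 0 <= relative_thickness <= 1")
    return 1.0 - (1.0 - relative_thickness) ** dimensions

# Fraction of the volume in the outermost 5% of the radius.
for d in (1, 2, 3, 100):
    print(d, round(outer_shell_fraction(d, 0.05), 4))
```

On a line the outer 5% holds 5% of the length, but in 100 dimensions the same thin slice holds more than 99% of the volume.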

The capacity theorem can be constructed for biological molecules that interact or have different states [12]. This means that these molecular machines are capable of making precise choices. Indeed, biologists know of many amazingly specific interactions; the theorem shows that not only is this possible but that biological systems can evolve to have as few errors as necessary for survival.

**Figure 1.1:** Method of computing information content at protein binding sites ($R_{sequence}$) from DNA sequences.
$$R_{sequence} = H_{before} - H_{after} \quad \text{(bits per site)}.$$
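The equation above can be sketched in code for a set of aligned binding sites. The sketch takes the uncertainty before binding as 2 bits per position (four equally likely bases, an assumption) and the uncertainty after binding as the Shannon uncertainty of the observed base frequencies in each column, then sums the differences over the site. It omits any small-sample correction, and the function names and toy sequences are hypothetical.

```python
import math
from collections import Counter

def position_uncertainty(column):
    """Shannon uncertainty, in bits, of the base frequencies in one alignment column."""
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def r_sequence(sites, h_before=2.0):
    """Sum over positions of H_before - H_after, in bits per site.

    h_before = 2.0 assumes the four bases are equally likely before binding.
    """
    if not sites:
        raise ValueError("need at least one aligned site")
    if any(len(site) != len(sites[0]) for site in sites):
        raise ValueError("all sites must have the same length")
    columns = zip(*sites)   # one tuple of bases per aligned position
    return sum(h_before - position_uncertainty(col) for col in columns)

# Toy alignment, invented purely for illustration.
toy_sites = ["ACGT", "ACGA", "ACCT", "ATGT"]
info = r_sequence(toy_sites)
print(f"R_sequence = {info:.3f} bits per site")
```

A fully conserved position contributes the full 2 bits, while a position where all four bases are equally frequent contributes nothing.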