Information theory was introduced by Claude Shannon in 1948 to precisely characterize data flows in communications systems. The same mathematics can also be fruitfully applied to problems in molecular biology. We start with the problem of understanding how proteins interact with DNA at specific sequences called binding sites. Information theory allows us to build an average picture of the binding sites, which can be displayed with a computer graphic called a sequence logo (https://alum.mit.edu/www/toms/glossary.html#sequence_logo).
Sequence logos show, in bits of information, how strongly each part of a binding site is conserved. They have been used to study a variety of genetic control systems. More recently, the same mathematics has been used to look at individual binding sites using another computer graphic called a sequence walker (https://alum.mit.edu/www/toms/glossary.html#sequence_walker). Sequence walkers are being used to predict whether sequence changes in human genes are deleterious mutations or neutral polymorphisms. It may be possible to predict the degree of colon cancer by this method.
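To make "bits of information" concrete, the following is a minimal sketch of the two quantities behind these graphics. It assumes DNA sites that are already aligned and of equal length, and it omits the small-sample correction used in practice. The function names and the example sites are hypothetical, chosen only for illustration; they are not taken from any published program. The per-position value is 2 minus the Shannon uncertainty of the bases at that position, and the individual information of one sequence is the sum of weights 2 + log2 f(b, l) over its positions.

```python
import math
from collections import Counter

BASES = "ACGT"

def position_frequencies(sites):
    """Return one {base: frequency} dict per aligned position."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    freqs = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        freqs.append({b: counts[b] / len(sites) for b in BASES})
    return freqs

def logo_information(sites):
    """Information per position in bits: 2 - H(l).

    Assumption: no small-sample correction is applied.
    """
    info = []
    for f in position_frequencies(sites):
        entropy = -sum(p * math.log2(p) for p in f.values() if p > 0)
        info.append(2.0 - entropy)
    return info

def individual_information(sites, sequence):
    """Information of one sequence in bits: sum of 2 + log2 f(b, l)."""
    total = 0.0
    for f, base in zip(position_frequencies(sites), sequence):
        if f[base] == 0:
            return float("-inf")  # base never observed at this position
        total += 2.0 + math.log2(f[base])
    return total

# Hypothetical aligned sites, for illustration only.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
per_position = logo_information(example_sites)
per_site = [individual_information(example_sites, s) for s in example_sites]
```

Under these assumptions, a fully conserved position carries 2 bits, and the mean of the individual values over the sites equals the sum of the per-position values, which is why the logo can be read as the average picture of the individual sites.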
How do genetic systems gain information by evolutionary processes? Information theory was used to observe information gain in the binding sites for an artificial 'protein' in a computer model of evolution. The model begins with zero information and, as in naturally occurring genetic systems, the information measured in the fully evolved binding sites is close to that needed to locate the sites in the genome. The transition is rapid, demonstrating that information gain can occur by punctuated equilibrium (https://alum.mit.edu/www/toms/paper/ev).
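The "information needed to locate the sites in the genome" can be sketched as the base-2 logarithm of the number of possible positions divided by the number of sites. The function name and the numbers below are illustrative assumptions, not values stated in this abstract.

```python
import math

def information_to_locate(genome_positions, num_sites):
    """Bits needed to pick num_sites out of genome_positions possible positions."""
    if genome_positions <= 0 or num_sites <= 0:
        raise ValueError("both arguments must be positive")
    if num_sites > genome_positions:
        raise ValueError("cannot have more sites than positions")
    return math.log2(genome_positions / num_sites)

# Illustrative values only: 16 sites among 256 possible positions.
bits_needed = information_to_locate(256, 16)
```

With these illustrative numbers the result is 4 bits, and this is the quantity that the measured information in the evolved binding sites would be compared against.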
Thomas D. Schneider is a tenured Research Biologist at the National Cancer Institute in Frederick, Maryland. He graduated from MIT in biology (1978) and received his Ph.D. from the University of Colorado in molecular biology (1986). His primary work is analyzing the binding sites of proteins on DNA and RNA in bits of information. However, the theory goes beyond this to characterize how molecules interact and their states. Once these properties are understood, we have an engineering concept of how the molecules make specific choices in the presence of the ubiquitous thermal noise. Turning this around, we can design the molecules to perform useful functions. This is, of course, known as nanotechnology. Thus the lab has a number of pure experimental research projects on a variety of molecular systems, including bacteriophage lambda, bacteriophage T7, and cancer-related p53 biology. We also have nanotechnology projects on molecular computers (patented), a molecular rotation engine (patent pending) and DNA sequencing (patent pending).
origin: 2005 Aug 30
updated: 2005 Aug 30