{version = 1.11; (* of normreg.p 1995 October 24}
(* begin module describe.normreg *)
(*
name
normreg: normalize results from sequence/value linear regression
synopsis
normreg(normregp: in, fresep: out, output: out)
files
normregp: parameters to control the program.
First line: maxsequences (integer): the maximum number of sequences to
generate. This determines the precision of the result.
Remaining lines: values to normalize, 5 per line.
The first, an integer, is the position in the binding site.
The remainder are real, the 4 linear regression weights for A, C, G,
and T.
fresep: 5 integers per line, giving the number of sequences
(maxsequences) and the number of bases at each frequency.
output: messages to the user
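
The normregp layout described above can be read with a few lines of code. The following is a minimal sketch in Python, not the program's own Pascal reader; the function name parse_normregp is hypothetical, and skipping blank lines is an assumption.

```python
def parse_normregp(text):
    """Return (maxsequences, rows) from the text of a normregp file.

    Each row is (position, (wA, wC, wG, wT)).
    Assumption: blank lines are skipped and fields are whitespace separated.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # First line: maxsequences (integer).
    maxsequences = int(lines[0].split()[0])
    rows = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) < 5:
            raise ValueError("expected a position and 4 weights: " + ln)
        position = int(fields[0])
        weights = tuple(float(x) for x in fields[1:5])
        rows.append((position, weights))
    return maxsequences, rows


example = """1000
-11 -0.36 -0.70 0.70 -0.23
0 1.14 -2.28 -0.76 -1.18
"""
maxsequences, rows = parse_normregp(example)
print(maxsequences, rows[0], rows[-1])
```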
description
We would like to view the results of a linear regression of sequences versus
measured values as a sequence logo. This program generates the 'frequencies'
and writes them in a form usable by the program frese.
examples
Using a value of 1000 for maxsequences and the normalized data in figure 3b
of Barrick.ribosomes1994 (see documentation) the normregp is:
1000
-11 -0.36 -0.70 0.70 -0.23
-10 -0.21 -0.59 0.52 -0.05
-9 -0.03 -0.60 0.56 -0.31
-8 0.12 -0.59 0.06 0.22
-7 0.37 -0.48 0.22 -0.38
-6 0.31 -0.51 0.07 -0.04
-5 0.18 0.08 -0.38 0.04
-4 0.04 -0.28 0.26 -0.10
-3 0.61 -0.16 -0.68 -0.23
-2 0.30 -0.02 -0.44 0.02
-1 -0.06 -0.07 -0.20 0.27
0 1.14 -2.28 -0.76 -1.18
Since normalizing data that are already normalized has no effect, these
can be used as input to this program. The results given to output are:
normreg 1.10
information Afrequency Cfrequency Gfrequency Tfrequency
0.2255 0.1743 0.1241 0.5031 0.1985
0.1197 0.2027 0.1386 0.4207 0.2379
SumInteger = 1001 maxsequences = 1000
at position -10, 1 was added to Ainteger to get them to sum properly
SumInteger = 1002 maxsequences = 1000
at position -10, -1 was added to Ainteger to get them to sum properly
0.1410 0.2424 0.1371 0.4373 0.1832
SumInteger = 999 maxsequences = 1000
at position -9, 1 was added to Ainteger to get them to sum properly
0.0565 0.2826 0.1389 0.2661 0.3123
0.0926 0.3623 0.1548 0.3118 0.1711
0.0562 0.3411 0.1502 0.2683 0.2404
SumInteger = 999 maxsequences = 1000
at position -6, 1 was added to Ainteger to get them to sum properly
0.0284 0.2989 0.2705 0.1707 0.2599
0.0283 0.2603 0.1890 0.3244 0.2263
SumInteger = 999 maxsequences = 1000
at position -4, 1 was added to Ainteger to get them to sum properly
0.1681 0.4608 0.2134 0.1269 0.1989
0.0463 0.3379 0.2454 0.1612 0.2554
SumInteger = 999 maxsequences = 1000
at position -2, 1 was added to Ainteger to get them to sum properly
0.0235 0.2353 0.2329 0.2045 0.3273
0.9402 0.7809 0.0255 0.1168 0.0767
SumInteger = 1001 maxsequences = 1000
at position 0, 1 was added to Ainteger to get them to sum properly
SumInteger = 1002 maxsequences = 1000
at position 0, -1 was added to Ainteger to get them to sum properly
When the fresep is then run through frese, makebk, alist (to make sure all is
ok), encode, rseq, dalvec and makelogo, the result is figure 5b in
Barrick.ribosomes1994.
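
This page does not state the formula behind the frequency columns. The example numbers are, however, consistent with taking exp(weight) for each base and dividing by the sum over the four bases, with the information column equal to 2 bits minus the Shannon uncertainty of those frequencies. The Python sketch below checks this reading against position -11; it is inferred from the example output and is not taken from the normreg source, and the function names are hypothetical.

```python
import math


def frequencies(weights):
    """Assumed normalization: exp(w) for each base divided by the sum."""
    expw = [math.exp(w) for w in weights]
    total = sum(expw)
    return [e / total for e in expw]


def information(freqs):
    """Assumed information measure: 2 bits minus the uncertainty H."""
    h = -sum(p * math.log2(p) for p in freqs if p > 0)
    return 2.0 - h


# Weights for position -11 from the example normregp.
weights_m11 = [-0.36, -0.70, 0.70, -0.23]
f = frequencies(weights_m11)
print([round(p, 4) for p in f], round(information(f), 4))
```

Under these assumptions the values agree, to within rounding, with the first data line of the example output (0.2255 0.1743 0.1241 0.5031 0.1985).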
documentation
@article{Barrick.ribosomes1994,
author = "D. Barrick
and K. Villanueba
and J. Childs
and R. Kalil
and T. D. Schneider
and C. E. Lawrence
and L. Gold
and G. D. Stormo",
title = "Quantitative Analysis of Ribosome Binding Sites in
{{\em E. coli.}}",
journal = "Nucl. Acids Res.",
volume = "22",
pages = "1287-1295",
comment = "1994 April 11. 22(7)",
year = "1994"}
see also
frese.p
author
Thomas Dana Schneider
bugs
technical notes
When a set of real numbers is rounded to integers, the integers will not
always add to the exact required total. Although this is a minor detail, the
frese program cannot work unless the numbers add to the same value at every
position. So this program detects when the integers do not add to
maxsequences, and then searches for a solution by adding to or subtracting
from the A integer value. The search is conducted in the series 1, -1, 2, -2,
3, -3 ... and both the +1 and the -1 cases appear in the example shown above.
Higher cases are NOT expected. Since this only modifies the last decimal
place (for maxsequences a power of 10) it does not significantly alter the
sequence logo.
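
The search described above can be sketched as follows. This is an illustrative Python version, not the Pascal code of normreg; the function names are hypothetical, rounding half up is an assumption, and the input frequencies are the four-decimal values printed in the example rather than the program's internal values.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, halves upward (an assumed rule)."""
    return int(math.floor(x + 0.5))


def integer_counts(freqs, maxsequences):
    """Round frequencies to counts, then repair the sum through the A count.

    Trial offsets 1, -1, 2, -2, ... are applied to the original A count
    until the four counts add to maxsequences.
    """
    counts = [round_half_up(p * maxsequences) for p in freqs]
    original_a = counts[0]
    step = 0
    while sum(counts) != maxsequences:
        step += 1
        magnitude = (step + 1) // 2          # 1, 1, 2, 2, 3, 3, ...
        if magnitude > maxsequences:
            raise ValueError("no adjustment found")
        # Odd steps try a positive offset, even steps the negative one.
        delta = magnitude if step % 2 == 1 else -magnitude
        counts[0] = original_a + delta
    return counts


# Position -10 of the example: the rounded counts add to 1001.
print(integer_counts([0.2027, 0.1386, 0.4207, 0.2379], 1000))
# Position -9 of the example: the rounded counts add to 999.
print(integer_counts([0.2424, 0.1371, 0.4373, 0.1832], 1000))
```

For position -10 the sketch first tries +1 (sum 1002) and then -1 (sum 1000), the same order as the two messages for that position in the example output.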
*)
(* end module describe.normreg *)
{This manual page was created by makman 1.45}