By downloading this code you agree to the
Source Code Use License (PDF).
{ version = 3.61; (* of makewalker.p 2006 Jul 07}
(* begin module describe.makewalker *)
(*
name
makewalker: walk an information weight matrix across a sequence
synopsis
makewalker(book: in, ribl: in, colors: in, makewalkerp: in,
walk: out, output: out)
files
book: a book from the Delila system
ribl: a weight matrix from the Ri program
colors: definitions of how to color letters. See makelogo.p for details.
makewalkerp: parameters to control this program
The first line must be the version number of the program.
This allows the program to recognize when the parameter file is old.
rangefrom: integer, FROM of the ribl matrix to use.
rangeto: integer, TO of the ribl matrix to use.
basesperline: integer, number of bases per line to display.
linesperpage: integer, number of lines per page to display.
basenumber: integer, the base on the line to place the zero of the walker
at initially on the page. It must be between 0 and basesperline - 1.
Counting begins at zero on the left side of the page.
linenumber: integer, the line number to place the zero of the walker at
initially on the page. It must be between 0 and linesperpage - 1.
Counting begins at zero on the bottom of the page.
coornumber: integer, the coordinate number to place the zero of the
walker at initially. If this number is not found in the piece
coordinate system, the walker will be placed at the beginning of the
sequence when coornumber's value is zero or negative and placed at the
end of the sequence when coornumber's value is positive.
pagewidth: real, the width of the lines of sequence in cm.
pageheight: real, the height of the lines of sequence in cm.
pagex: real, the x coordinate of the page lower left corner in cm.
pagey: real, the y coordinate of the page lower left corner in cm.
lowerbound: real < 0, the lowest Ri(b,l) value in bits that can be fully
displayed (bases with lower values are clipped and have a red line on
the bottom).
boxes: character: if 'b' then the walker characters are surrounded by
character-boxes as defined below. Otherwise the boxes are invisible.
outofsequence: character: if 'o' then the walker is set next to the
sequence. Otherwise the walker is in line with the sequence. Thanks
to Seth Taylor for suggesting this option on 1994 November 22.
ALL LINES FOLLOWING THIS POINT: These are inserted into the walk
as commands before the initial display.
walk: A postscript program that implements the walk.
It is to be run with ghostscript:
gs -q walk
Ghostscript then pops up a graphics window and the user types commands to
control the display. (The -q just makes ghostscript quiet on startup.)
The program reports information to the user that includes the position,
the individual information for the current position (Ri, bits) and the Z
score for this Ri given the mean (Rsequence) and standard deviation of
the original population of sequences used to create the ribl matrix.
When the absolute value of the Z score is less than or equal to 2, an
arrow (<---) indicates that the position is likely to be a site.
Likewise, when the Ri value is positive, this is indicated by plus signs
(++++). (The actual test can be set by the user.) The user can type '?'
or 'help' to get a list of commands. These commands are discussed in
further detail below.
NOTE: the Ri evaluation is ONLY for the portion of the walker displayed
on the screen.
output: Messages to the user.
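As a rough illustration of the report described for the walk file, the
sketch below computes a Z score from an Ri value and applies the two default
markers. It is written in Python rather than PostScript, the names z_score
and report_markers are hypothetical, the Z score is assumed to be the usual
(Ri - mean) / standard deviation, and the numbers are made up.

```python
def z_score(ri, rsequence, stdev):
    """Z score of an Ri value, assuming Z = (Ri - Rsequence) / stdev."""
    return (ri - rsequence) / stdev


def report_markers(ri, rsequence, stdev):
    """Return the markers the report is described as showing by default."""
    markers = []
    if abs(z_score(ri, rsequence, stdev)) <= 2:
        markers.append("<---")   # |Z| <= 2: likely to be a site
    if ri > 0:
        markers.append("++++")   # positive Ri
    return markers


# Made-up population: mean 8.0 bits, standard deviation 2.5 bits.
print(report_markers(6.0, 8.0, 2.5))
print(report_markers(-3.0, 8.0, 2.5))
```

Since the actual test can be set by the user, the thresholds here only
mirror the defaults stated above.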
description
This program creates a PostScript program, called the "walk", by
reformatting the DNA sequences in a Delila book and joining them to the ribl
matrix. The user then runs the "walk" using the interactive PostScript
interpreter ghostscript. Within the ghostscript graphic page appears part
or all of the sequence(s) in the book. The majority of the letters are
black, but a portion are in color. These letters correspond to the
evaluation of those bases by the Ri(b,l) matrix read from the ribl file.
The height of each letter is proportional to its weight in the matrix. Thus
the user can immediately see the components of the weight matrix as applied
to the particular sequence. The user may then type commands to move the
evaluated region around. The user literally walks the evaluation across the
sequence, and thereby gains a sense of the reaction of each part of the
recognizer to each part of the sequence.
GENERAL SCHEME OF A WALKER PAGE
A walker page consists of a rectangular array of character boxes:
       <------------- basesperline ------------> (10 in this case)
         0   1   2   3   4   5   6   7   8   9
  ^    -----------------------------------------     ^
p |    |152|153|154|155|156|157|158|159|1  |2  |     |
a |    |   |   |   |   |   |   |   |   |   |   |  2  |
g |    |   |   |   |   |   |   |   |   |   |   |     |
e |    |   |   |   |   |   |   |   |   |   |   |     |
h |    -----------------------------------------     |
e |    |3  |4  |5  |6  |7  |8  |9  |10 |11 |12 |     |
i |    |   |   |   |   |   |   |   |   |   |   |  1  linesperpage
g |    |   |   |   |   |   | ! |   |   |   |   |     (3 in this case)
h |    |   |   |   |   |   |   |   |   |   |   |     |
t |    -----------------------------------------     |
  |    |13 |14 |15 |16 |17 |18 |19 |20 |21 |22 |     |
( |    |   |   |   |   |   |   |   |   |   |   |  0  |
c |    |   |   |   |   |   |   |   |   |   |   |     |
m |    |   |   |   |   |   |   |   |   |   |   |     |
) v    *----------------------------------------     v
       *
       * <----------- pagewidth (cm) ------------>
       *
       **** lower left hand corner is at pagex horizontal (cm) and pagey vertical
            (cm) on the page, starting from the PostScript default zero coordinate.
The "!" is at basenumber = 5, linenumber = 1, coornumber = 8
All the parameters: basenumber, linenumber, coornumber, basesperline,
linesperpage, pageheight, pagex and pagey are defined independently. The
physical positioning parameters pagex, pagey, pagewidth and pageheight
determine where the entire set of character boxes is placed on the page.
Each character box size is determined by the basesperline and linesperpage
so that the required number fit the defined area of the page. The zerobase
of the walker is set initially at the coordinate given by basenumber and
linenumber. The coordinates of the bases for the rest of the sequence are
determined by the coordinate of the zerobase of the walker.
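The relation between the page parameters and the character boxes can be
sketched as follows. This is only a reading of the description above, not
the PostScript the program generates; the function name charbox is
hypothetical, and the in-line (not out-of-sequence) layout is assumed.

```python
def charbox(basenumber, linenumber, basesperline, linesperpage,
            pagewidth, pageheight, pagex, pagey):
    """Return (x, y, width, height) in cm of one character box."""
    width = pagewidth / basesperline       # boxes share the page width
    height = pageheight / linesperpage     # lines share the page height
    x = pagex + basenumber * width         # bases count from 0 at the left
    y = pagey + linenumber * height        # lines count from 0 at the bottom
    return x, y, width, height


# 50 bases per line and 3 lines in an 18.5 cm by 24.9 cm area at (1.5, 1.5).
x, y, width, height = charbox(20, 0, 50, 3, 18.5, 24.9, 1.5, 1.5)
print(round(x, 3), round(y, 3), round(width, 3), round(height, 3))
```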
Note that the coordinate system in the example above represents a fragment
of a circular DNA, with coordinates running from 152 up to 159, followed by
a jump to the start of numbering at 1 and then proceeding up to 22. (These
kinds of coordinates can be generated and handled by Delila programs.)
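The way the zero base fixes every other coordinate on the page can be
illustrated with the example above. This sketch assumes that the sequence
is laid out left to right with the top line first, as the diagram suggests;
the function name layout and the explicit coordinate list are hypothetical.

```python
def layout(coords, zero_coord, basenumber, linenumber,
           basesperline, linesperpage):
    """Map (base, line) cells to coordinates, given where the zero base sits."""
    zero_index = coords.index(zero_coord)
    # Line numbers count from the bottom, so the top line comes first.
    zero_cell = (linesperpage - 1 - linenumber) * basesperline + basenumber
    page = dict()
    for cell in range(basesperline * linesperpage):
        i = zero_index + (cell - zero_cell)
        if 0 <= i < len(coords):           # cells past the sequence stay empty
            base = cell % basesperline
            line = linesperpage - 1 - cell // basesperline
            page[(base, line)] = coords[i]
    return page


# Circular fragment from the diagram: 152..159, then 1..22.
coords = list(range(152, 160)) + list(range(1, 23))
page = layout(coords, 8, 5, 1, 10, 3)
print(page[(0, 2)], page[(5, 1)], page[(9, 0)])
```

With coordinate 8 at basenumber 5 and linenumber 1, the top left cell holds
152 and the bottom right cell holds 22, matching the diagram.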
GENERAL SCHEME OF A WALKER CHARACTER BOX
+---+ <-- 2 bits per base
| |
|---| <-- 0 bits per base
| |
| |
| |
+---+ <-- lowerbound bits per base
The box has a part above zero in which letters appear upright and a part
below zero in which the letters appear rotated 180 degrees if they are
within the evaluated region or black and upright if they are outside.
If the walker is out of the sequence, then a gap of height 1 bit is
created just above the 2 bits mark. The sequence is put there. The rest
of the characterbox is scaled accordingly.
Bases which have positive Ri(b,l) values run upward from 0 to 2 bits,
those that have a negative value run downward. If a base evaluates to a
number of bits lower than lowerbound, it will be drawn down but any amount
below lowerbound is cut off. To indicate this situation, the background
becomes purple. If the base has a value less than -log2(n) bits (where n
is the number of sequences used to make the ribl model), it is considered
to be negative infinity, and the background becomes black.
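The clipping rules above can be summarized in a small sketch. The function
name classify is hypothetical, and checking the negative infinity case
before the lowerbound case is an assumption; the text does not say which
test the program applies first.

```python
import math


def classify(bits, lowerbound, n):
    """Return (bits actually drawn, background color or None) for one base."""
    drawn = max(bits, lowerbound)          # nothing is drawn below lowerbound
    if bits < -math.log2(n):
        return drawn, "black"              # treated as negative infinity
    if bits < lowerbound:
        return drawn, "purple"             # clipped at lowerbound
    return drawn, None


# lowerbound of -4 bits and a model built from 100 sequences.
for value in (1.2, -5.0, -7.0):
    print(value, classify(value, -4, 100))
```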
COMMANDS
When the walk program is run in GhostView, the user can control the
display by means of typed commands. These commands are built from
PostScript procedures. This means that any arguments must be given before
the command itself. This may feel a little strange at first, but it is
easy to get used to. For example, to go to location 132, the user types:
132 goto<cr>
where <cr> is a carriage return.
# means that the command is preceded by a number.
* means not implemented yet
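To show why arguments come before the command, here is a toy stack-based
reader for two of the commands described below (goto and jump). It is a
Python illustration of the postfix idea only, not the walker's PostScript,
and the function name run is hypothetical.

```python
def run(commands, position=0):
    """Apply a line of walker-style postfix commands to a position."""
    stack = []
    for token in commands.split():
        if token == "goto":
            position = stack.pop()         # absolute coordinate
        elif token == "jump":
            position += stack.pop()        # relative move
        else:
            stack.append(int(token))       # a number waits for its command
    return position


print(run("132 goto"))
print(run("132 goto -5 jump"))
```

Each number is pushed onto a stack and the command that follows consumes
it, which is how a PostScript interpreter treats the typed line.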
Movement Commands: These commands affect the direction that the walker or
the sequence moves. Which moves depends on the w command. The commands
are the same as those of the Unix editor vi.
# h: move left on the page (# is optional)
# j: move down on the page (# is optional)
# k: move up on the page (# is optional)
# l: move right on the page (# is optional)
Move commands may have an integer in front which says how many times to
move. The program will repeat the command.
* n: next sequence
* p: previous sequence
w: A toggle between two states:
the walker moves along the stationary sequence,
or
the sequence moves along the stationary walker.
q: quit
?: help message
r: Refresh the page.
R: restore or restart ghostscript on the current walk file. This allows
one to start over or to modify the walk and restart without quitting
ghostscript. The modification could be done by the makewalker program,
by hand-editing or by another program.
cl: clear the ghostscript command screen.
# A,C,G,T: Mutate the given absolute location to the desired base. For
example, to set base 100 to be an "A", type "100 A".
# a,c,g,t: Mutate the given relative location to the desired base. The
location is relative to the current position of the walker. For
example, to set the base 10 to the left of the walker zero to be an
"a", type "-10 a".
# setwait: set the wait time in seconds after display (starts at zero)
# isasecond: set the number of {1 pop} cycles per second. This depends on
how fast your computer is and should be adjusted.
# goto: Type a coordinate and then "goto". For example, to get to
coordinate 100 type "100 goto". The zero base of the walker will be
set to the coordinate.
# invert: invert the Ribl matrix. This is only useful if you have an
asymmetric binding site.
# jump: Like goto except one gives the relative number of bases to move.
For example, to move 5 bases in the 5' direction, type "-5 jump". The
zero base of the walker will be set to the new coordinate.
boxes: toggle between having boxes and not. These are mostly helpful
for seeing where things are on the page.
# lines: Set the number of lines per page, eg type "3 lines".
# bases: Set the number of bases per line, eg type "30 bases".
("wide" can also be used)
# left, right, up, down: move the graphic on the page in units of cm.
example: "0.5 right" moves the graphic right half a cm.
# height, width: set the page height or width in cm.
in: Put the walker into the sequence.
out: Put the walker out of the sequence.
# wave: define base at which the low point of the cosine wave is set.
example: "5 wave" puts the low point at base +5.
waveon: Turns on drawing the wave.
waveoff: Turns off drawing the wave.
toggleprinting or tp: a toggle that turns on and off printing. This allows
one to give several commands without seeing the display change. Turning
printing on automatically causes a display.
NOTE: printing is initially off to allow displays to be created without
showing anything. It may be turned on as the first user command
following the other makewalkerp parameters.
toggleerase or te: a toggle that turns on and off erasing the page. In
conjunction with the toggleprinting command this allows one to display
several walkers on a page for making a figure.
togglereport or tr: a toggle that turns on and off reports to output.
If it is placed as the first user defined command in the makewalkerp,
then there will be no output messages and ghostview will not put
up a display message. This is useful for embedding in another figure.
# from: change FROM range of the matrix to use
# to: change TO range of the matrix to use
help: help message
# setri: set minimum Ri for searching and display
# setz: set minimum Z for searching and display
# f: search forward to next site which fits search criteria
# b: search backward to next site which fits search criteria
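The idea behind setri and f can be sketched as follows: Ri at a position is
the sum of the matrix weights for the bases under the walker, and a forward
search steps until that sum reaches the minimum. This is a minimal sketch,
not the program's code; the function names and the toy matrix are made up,
positions are 0-based string indexes rather than Delila coordinates, and
the Z criterion is left out.

```python
def individual_information(seq, pos, matrix, rangefrom, rangeto):
    """Sum the matrix weights of the bases from pos+rangefrom to pos+rangeto."""
    total = 0.0
    for l in range(rangefrom, rangeto + 1):
        i = pos + l
        if 0 <= i < len(seq):              # skip positions outside the sequence
            total += matrix[l][seq[i]]
    return total


def search_forward(seq, start, matrix, rangefrom, rangeto, min_ri):
    """Return the first position after start whose Ri is at least min_ri."""
    for pos in range(start + 1, len(seq)):
        if individual_information(seq, pos, matrix, rangefrom, rangeto) >= min_ri:
            return pos
    return None


# Toy 3-position matrix (made-up weights in bits) that favors "tag".
toy = {
    -1: {"a": -1.0, "c": -1.0, "g": -1.0, "t": 2.0},
     0: {"a": 2.0, "c": -1.0, "g": -1.0, "t": -1.0},
     1: {"a": -1.0, "c": -1.0, "g": 2.0, "t": -1.0},
}
seq = "ccctagcc"
print(search_forward(seq, 0, toy, -1, 1, 5.0))
```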
TO MAKE PRINTOUTS
The walker is interactive, which means that the PostScript showpage function
is not called since it would pause the screen and then wipe out the display
at every command. However, printers require showpage, and if it is not
included they won't print anything: they will spend a few
minutes rendering the page and then nothing will come out! To make
printouts, attach:
gsave showpage grestore
to the end of the walk file. The gsave/grestore ensure that the graphics
state is not lost during the showpage. You can put any commands you like in
front of the showpage:
180 goto boxes out showpage
This allows one to set up the page as desired.
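Preparing a printable copy can be scripted. The sketch below works on the
text of a walk file and appends optional setup commands followed by the
showpage line; the function name with_showpage is hypothetical, and placing
the setup commands on their own line before the showpage is one possible
reading of the instructions above.

```python
def with_showpage(walk_text, commands=""):
    """Return the walk text with optional commands and a showpage appended."""
    if not walk_text.endswith("\n"):
        walk_text += "\n"
    if commands:
        walk_text += commands + "\n"       # page setup runs before showpage
    return walk_text + "gsave showpage grestore\n"


# "% walk" stands in for the contents of a real walk file.
print(with_showpage("% walk", "180 goto boxes out"), end="")
```

The result would be written to a separate file so that the interactive walk
file stays free of showpage.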
TO IMBED IN FIGURES
In addition to the note above about showpage, the walk file contains
commands that translate the image. To prevent these from affecting the
surrounding PostScript, they must be enclosed in a gsave-grestore pair.
The gsave is provided at the start of the walk file. The grestore is
provided by the q command.
Commands can be put at the end of the parameter (makewalkerp) file. The
command toggleprint is called before and after these commands, so the
commands are normally not seen. If you surround your commands with calls
to toggleprint, you will see a movie of the actions taken.
The command toggleerase allows one to draw several walkers on a page,
merely by preventing the previously drawn one from being erased. However,
if a figure is imbedded into an AdobeIllustrator figure and toggleerase is
called when printing is active, this action may wipe out other parts of
the figure. This can be prevented by turning off the erase with
toggleerase before turning on the printing with toggleprint.
If the command togglereport is the first command, then the messages sent
to standard output, which appear on the ghostscript control window, are
all suppressed (errors are still reported). This prevents a display
window from popping up in ghostview.
This is an example of what to add to the end of the makewalkerp to make a
figure:
togglereport % turn off messages to output
waveoff 5 up % do some things silently
toggleerase % do this before the toggleprint
toggleprint % turn on printing
6 down l % jump around
toggleprinting toggleprinting % force printing
6 down l % jump around
toggleprinting toggleprinting % force printing
showpage
Do not use copypage for figures as this halts the display.
ACKNOWLEDGMENTS
I thank Seth Taylor for suggesting the mode for the walker being outside
the sequence, Paul Hengen for suggesting the cosine wave applied to the
letters and Denise Rubens for suggesting the mutation function.
examples
-10 rangefrom: integer, FROM of the ribl matrix to use
+10 rangeto: integer, TO of the ribl matrix to use
50 basesperline: integer, number of bases per line to display.
3 linesperpage: integer, number of lines per page to display.
20 basenumber: integer, the base on the line to place the zero of the walker
1 0 linenumber: integer, the line number to place the zero of the walker
132 coornumber: integer, the coordinate number to place the walker zero
18.5 pagewidth: real, the width of the lines of sequence in cm.
24.9 pageheight: real, the height of the lines of sequence in cm.
1.5 pagex: real, the x coordinate of the page lower left corner in cm.
1.5 pagey: real, the y coordinate of the page lower left corner in cm.
-4 lowerbound: real < 0, the lowest Ri(b,l) value in bits displayed
nb boxes: b: boxes around each character
io insequence: i: in the sequence, else out
% all lines from this point on are PostScript commands
% The "%" makes a comment
% makewalkerp: parameters for makewalker 3.03 and higher
% The following commands make a picture of 2 walkers
% waveoff % turn off waves
1 lines % display only one line
10 up % move 10 cm up
5 height % make the line only 5 high
44 wide % show 44 characters across
w 5 h w % move the sequence 5 positions left
132 goto % put the walker in a new spot
toggleprinting toggleprinting % force printing
toggleerase % prevent erasing during the next steps
6 down % jump 6 cm down
143 goto % put the walker in a new spot
toggleprinting toggleprinting % force printing
% gsave showpage grestore % uncomment this line if you send this to a printer!
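The constraints stated for the makewalkerp parameters can be checked before
running the program. This sketch assumes the values have already been read
into a dictionary; the function name check_parameters and the dictionary
layout are hypothetical, and the values shown are illustrative.

```python
def check_parameters(p):
    """Return a list of problems found in a dict of makewalkerp values."""
    problems = []
    if not 0 <= p["basenumber"] <= p["basesperline"] - 1:
        problems.append("basenumber must be between 0 and basesperline - 1")
    if not 0 <= p["linenumber"] <= p["linesperpage"] - 1:
        problems.append("linenumber must be between 0 and linesperpage - 1")
    if not p["lowerbound"] < 0:
        problems.append("lowerbound must be less than 0")
    return problems


good = dict(basesperline=50, linesperpage=3, basenumber=20,
            linenumber=0, lowerbound=-4.0)
bad = dict(good, basenumber=50, lowerbound=2.0)
print(check_parameters(good))
print(check_parameters(bad))
```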
documentation
Ghostscript documentation can be found from:
<a href = http://www.cs.wisc.edu/~ghost/index.html>
http://www.cs.wisc.edu/~ghost/index.html</a>
see also
delila.p, makelogo.p, ri.p, scan.p, dnaplot.p
author
Thomas Dana Schneider
bugs
Known Bugs:
Only one sequence is loaded from the book.
With parameter for 3 lines, reset to 1 line puts the entire display too
low. Yet starting with 1 line it's ok. Some global parameter is not being
set in definepageparameters. (Same thing: When there is one line per page
the position is too low, one needs to use (eg) "5 up".)
180 goto 1 goto - it doesn't erase old stuff to left!
Something uses up virtual memory every time the walker takes a step.
Eventually this causes an error and GhostScript dies:
Error: /VMerror in --charpath--
VM status: 0 16061098 16168018
Current file position is 5
XIO: fatal IO error 12 (Not enough memory) on X server ":0.0"
after 47675 requests (45252 known processed) with 2497 events remaining.
Why?
When number of lines per page is changed, the cosine wave height does not
change correctly, often being too small. (Apparently fixed.)
The display glitches sometimes by leaving behind pieces that should get
erased. This occurs when numbers are being displayed that don't fit
into the available area and get clipped. A relevant location in the code is
in the routine displaywalker at: "white 0 0 charbox fill" A
replacement: "0 0 charbox clip erasepage initclip" does not help. Perhaps
this is the wrong part of the code. It is also possible that the problem is
in ghostscript. The effect sometimes occurs as one is moving the walker
around. Letters that are drawn that go below the lower bound don't get
clipped properly; they leave a slight edge there.
Range checking does not work properly. If the ribl has a range
from -100 to +99, then a request for -99 to +100 bombs. This
should be caught in walker.
Perhaps there should be a function that automatically defines the
lower bound in bits so that the user does not need to figure this out.
Resetting lower bound messes up the display!
f (and probably b) searches don't work when the display is toggled
off. Fortunately this is easy to get around: just determine the
locations and use goto.
If one has a small sequence, visible on the screen and then sets the move
mode to move the sequence with the walker steady (ie use the w toggle),
then when the end of the sequence moves in, the last character is not
removed, so there are repeating bases on the end.
technical notes
Note: encapsulation of the figure requires a gsave and a grestore to
surround the walk code to undo the translation to the basenumber = 0,
linenumber = 0 coordinate and any other translations done by commands.
No showpage is provided, since this does not help during interactive
graphics. Worse, ghostscript pauses at every showpage or copypage, saying:
">>copypage, press <return> to continue<<"
So the user would be forced to type extra carriage returns for every
command. If a showpage is needed for making a printout, it must be added
later as "gsave showpage grestore".
isasecond is a global constant that defines the number of {1 pop} operations
that the display can run through in 1 second. This must be determined for
each computer.
The bounding box for EPS is defined in the constants.
*)
(* end module describe.makewalker *)