By downloading this code you agree to the
Source Code Use License (PDF).
{ version = 1.05; (* of hc.p 2017 Jul 12}
(* begin module describe.hc *)
(*
name
hc: H-curve and H uncertainty
synopsis
hc(inst: in, book: in, hcp: inout, colors: in,
namebook: in, namelist: in, avalues: in,
list: out, clist: out, xygraph: out, output: out)
files
inst: delila instructions of the form 'get from 56 -5 to 56 +10;'
If this file is empty, then the sequences will be
aligned either by their 5' ends or by their zero base,
depending on the 4th parameter in hcp.
book: the book generated by delila using inst
hcp: parameters to control the program. If empty, the range of the
instructions is used. Otherwise,
0. The version of hc that this parameter file is designed for.
If the program finds an old version, it will *upgrade* the
hcp file.
1. The first line contains two integers defining the
range of bases to display. This allows one to have a wide alignment,
but look only at a portion.
2. If the first character of the second line is 'p', the piece name
and coordinates are given in the list. If it is ' ' then neither
is given. If one has only one piece that one is working with,
one may not want the piece name but will want the coordinate.
In this case use 'c'.
If the second character is 'l', then the
long name of the piece is given in the list preceding the piece
name. Note that this long name can be written into the book from
the instructions by the Delila "name" instruction. (See <NAME> in
the libdef.) Blank names (i.e., 'name "";') are accepted.
If the third character is not '-', then the sequences are
numbered. If it is '-' they are not.
3. If the first character of the third line is 'n' then paging is not
done to the list.
4. If the first character of the fourth line is
'f' (for 'first') then the sequences are always aligned by their
first base.
'i' then the sequences are aligned by the delila instructions. If
the inst file is empty, alignment is forced to the 'b' mode.
'b' (for 'book') then the alignment is on the internal zero of
the book's sequence. This option is to be used when "default
coordinate zero" is used in the Delila instructions.
The following table should clarify the cases and their uses:
state |instructions empty | instructions exist
------|-------------------|-------------------------------------------
      |                   | instruction alignment  | book alignment
      |                   | (def coo nor)          | (def coo zer)
      |-------------------|------------------------|------------------
 'f'  | first base        | first base             | first base
      | (first base)      | (first base)           | (first base)
      |-------------------|------------------------|------------------
 'i'  | book              | inst                   | inst (DO NOT USE)
      | (0)               | (aligning base)        | (aligning base)
      |-------------------|------------------------|------------------
 'b'  | book              | book (DO NOT USE)      | book
      | (0)               | (0)                    | (0)
The first line of each entry defines how the alignment will be
assigned. Thus 'f' forces the first base to be used at all times and
'b' forces the book to be used. In two cases this does not make sense.
First, if the instructions were generated with the "default coordinate
zero", then the Delila instructions do not correspond to the base
coordinates in the book (by definition) and so the alignment should not
use the instruction file. In the second case, the instructions use
"default coordinate normal" so the zero base in the book does not
correspond to the zero base in the instructions. The basic problem
here is that there is no way for the program to know which situation
occurs, without spending time reading the Delila instructions. So the
user must specify. (This may be automated in the future.)
The second line of each entry is the coordinate number which appears on
the left column of the aligned listing.
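The alignment table can be read as a small lookup. The following Python
sketch is illustrative only: the function name and the returned strings are
assumptions made for this example and are not taken from hc.p. It ignores
the two DO NOT USE cases, which the program cannot detect on its own.

```python
from typing import Tuple


def resolve_alignment(mode: str, inst_is_empty: bool) -> Tuple[str, str]:
    """Return (alignment source, left-column coordinate) for a mode character.

    Hypothetical helper mirroring the table: 'f', 'i' and 'b' are the
    fourth-line characters of hcp.
    """
    if mode == "f":
        return ("first base", "first base")
    if mode == "i" and not inst_is_empty:
        return ("inst", "aligning base")
    if mode in ("i", "b"):
        # An empty inst file forces 'i' into the 'b' mode.
        return ("book", "0")
    raise ValueError("unknown alignment mode: {!r}".format(mode))


print(resolve_alignment("i", True))
```

Whether the instructions were written with "default coordinate zero" still
has to be supplied by the user, exactly as described above.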
5. Column number to read from avalues file (integer),
followed by the field width and number of decimal places to write
the values to the list and clist.
6. edgecontrol edgeleft, edgeright, edgelow, edgehigh:
edgecontrol is a single character that controls how the bounding
box of the figure is handled. If it is 'p' then the bounding
box will be the page parameters defined in constants inside the
program (llx, lly, urx, ury). Otherwise, there are four real
numbers that define the edges around the clist in cm. To allow
a clist to be embedded into another figure, its size must be
defined in PostScript (with %%BoundingBox). By setting these
four numbers, the edges are defined.
7. map control: A series of values:
* mapcontrol: If the first character on the line is a 'C', then the
color map file will be written. If it is 'R' then the page will
be set up so that the upper left corner is moved to the lower left
corner and the image is rotated 90 degrees counter clockwise.
This has the effect of making the image in "landscape" mode.
* fontsize (integer): The character height in points (there
are 72 points/inch, 2.54 cm/inch). Typical value for hc: 15.
8. deltaXcm deltaYcm scaleimage: image positioning controls
* deltaXcm: The amount to move the image in X (cm).
* deltaYcm: The amount to move the image in Y (cm).
* scaleimage: the scaling factor.
The image will be shifted on the printed page. X is positive to the
right and Y is positive up the page. Generally one would use
positive values for X and negative values of Y since the image
should otherwise fit snugly in the upper left corner of the page.
The scaling is performed after movement from the lower left hand
corner of the image as one would read it. If the image has been put
in "landscape" mode the delta-shifts are given in the new coordinate
system. This allows one to switch between "landscape" and regular
"portrait" mode without changing the parameters, and it allows one
to think in terms of a normally held page.
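As a rough illustration of the positioning controls, the Python sketch below
converts centimeters to points (72 points/inch, 2.54 cm/inch) and applies the
shift followed by the scaling about the lower left corner of the image. The
function names are hypothetical, and the order of operations is an assumption
based on the description above, not a transcription of hc.p.

```python
POINTS_PER_INCH = 72.0
CM_PER_INCH = 2.54


def cm_to_points(cm):
    """Convert a length in centimeters to PostScript points."""
    return cm * POINTS_PER_INCH / CM_PER_INCH


def place(x_pt, y_pt, delta_x_cm, delta_y_cm, scale_image):
    """Map an image point (in points) to page coordinates.

    Assumed order: move the lower left corner by (deltaXcm, deltaYcm),
    then scale the image about that corner.
    """
    return (cm_to_points(delta_x_cm) + scale_image * x_pt,
            cm_to_points(delta_y_cm) + scale_image * y_pt)


# Shift 1 inch right and 1 inch down, then halve the image size.
print(place(100.0, 200.0, 2.54, -2.54, 0.5))
```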
9. headercontrol: the first character on the line determines
whether the header description is written to list and clist.
If the character is 'h' it is written, otherwise not.
Headers can also be removed from the clist by deleting lines
containing the word "NOHEADER". In Unix this is done by:
grep -v NOHEADER clist > clist.noheader
With 'h' the numbar (bar of vertically written numbers) is included
above the sequence, but if the character is '0' (zero) the numbar is
not written. This allows one to use the list file to extract column
data easily; otherwise it is not recommended.
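Where grep is not available, the same line filtering can be sketched in
Python. This is a minimal example under the assumption that a plain substring
test is enough for markers such as NOHEADER (or REMOVE in the clist); the
function name is hypothetical.

```python
def drop_marked_lines(lines, marker):
    """Keep only the lines that do not contain marker (like grep -v)."""
    return [line for line in lines if marker not in line]


clist_lines = [
    "% title line NOHEADER",
    "(acgt) show",
    "% numbar NOHEADER",
]
print(drop_marked_lines(clist_lines, "NOHEADER"))
```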
namebook: names of genes or transcripts from this book appear in
the list. If namebook is empty, then only the items specified in
hcp are given.
namelist: if this file is not empty, then it should contain a simple list
of names to give to each sequence listed. These are placed to the
left of the hc and may contain anything one wants. The number of
columns used is determined by the longest line in the file.
avalues: Aligned list values. A file containing values to list for
each of the sequences. If the file is not empty, the values appear
to the right of the sequences. The first line of the file is
expected to begin with "* " followed by the title of the values.
All other lines that begin with "*" are ignored. The program uses
the data column of avalues as defined in the hcp parameter file.
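The avalues rules above can be sketched as a small reader. The Python below
is an illustration, not the parser in hc.p: it assumes the column number is
1-based, that columns are separated by whitespace, that the chosen column is
numeric, and that blank lines may be skipped. The function name is
hypothetical.

```python
def read_avalues(lines, column):
    """Return (title, values) from avalues-style lines.

    The first line must begin with "* " and carries the title; any other
    line beginning with "*" is ignored.
    """
    if not lines or not lines[0].startswith("* "):
        raise ValueError('first line must begin with "* "')
    title = lines[0][2:].strip()
    values = []
    for line in lines[1:]:
        if line.startswith("*") or not line.strip():
            continue
        values.append(float(line.split()[column - 1]))
    return title, values


example = ["* information, bits", "* a comment", "1 7.5", "2 9.25"]
print(read_avalues(example, 2))
```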
list: the aligned listing
clist: the aligned listing, in PostScript color. Paging is ALWAYS done
to this file, using the page parameter. However, it can be removed
by deleting all lines with the word "REMOVE" on them. This is
easily done in Unix with:
grep -v REMOVE clist
xygraph: H curve xy coordinates for each sequence
colors: colors defining the bases, see makelogo for definition.
output: messages to the user
description
Hc builds an H-curve, an H-computation, or both on top of the alist program.
Hc is like alist in that it creates an aligned listing of a set
of sequences. However, the value column is the Shannon uncertainty
in bits.
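For reference, the Shannon uncertainty in bits is H = -sum(p log2 p) over the
symbol frequencies p. The Python sketch below computes it from the base
composition of a single sequence. That choice of frequencies is an assumption
made for illustration; the text above does not state which frequencies hc
uses, and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_uncertainty(sequence):
    """Return H in bits for the symbol composition of one sequence."""
    counts = Counter(sequence.lower())
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Four equally frequent bases give the maximum of 2 bits.
print(shannon_uncertainty("acgtacgt"))
```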
In addition, hc produces an xygraph file which gives the H-curve
coordinates (starting at zero) for each sequence. These can be
plotted using denplo.
documentation
@article{Hamori.Ruskin1983,
author = "E. Hamori
and J. Ruskin",
title = "{H curves, a novel method of representation of nucleotide
series especially suited for long DNA sequences}",
journal = "J Biol Chem",
volume = "258",
pages = "1318--1327",
pmid = "6822501",
year = "1983"}
see also
Program that does aligned listings upon which hc is built: alist.p
program that produces the book: delila.p
search program to help locate sites: search.p
example inst: spliceA.in
example book: spliceA.bk
example aligned listing parameter file: hcp
example colors file: colors
To learn about page printer boundaries, go to
https://alum.mit.edu/www/toms/postscript.html#tricks
author
Thomas D. Schneider
bugs
If you use relative instructions, then hc will bomb.
That is, do not use instructions of the form:
get from gene beginning - 5 to gene beginning +5;
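One way to guard against this is to screen the inst file before running hc.
The Python sketch below accepts only the simple absolute form shown for the
inst file ('get from 56 -5 to 56 +10;') and flags everything else. The
pattern is an assumption made for this example and covers only that one form,
not the full Delila instruction grammar; the names are hypothetical.

```python
import re

# Matches: get from <integer> [offset] to <integer> [offset];
SIMPLE_ABSOLUTE_GET = re.compile(
    r"^\s*get\s+from\s+(-?\d+)\s*([+-]\s*\d+)?"
    r"\s+to\s+(-?\d+)\s*([+-]\s*\d+)?\s*;\s*$"
)


def is_simple_absolute_get(instruction):
    """Return True if the line is a 'get' with purely numeric coordinates."""
    return SIMPLE_ABSOLUTE_GET.match(instruction) is not None


print(is_simple_absolute_get("get from 56 -5 to 56 +10;"))
print(is_simple_absolute_get(
    "get from gene beginning - 5 to gene beginning +5;"))
```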
There is also an unsolved bug in hc:
When the pieces and instructions are not 'just right', hc will
produce listings that are thousands of characters wide... The reason
for this is not completely clear, but it is related to attempting
to extend the from-to range of an aligned book, and perhaps to incorrect
responses of delila when attempting to 'reduce' a piece beginning or
ending that is off the end of a fragment of a circular piece. The code
now contains traps that halt the program when wide listings would have
been generated. This bug may have been solved.
Alist cannot align a sequence if the alignment point is outside the
sequence.
Note: it is possible to use the 'i' mode when "default coordinate zero"
has been set, but this can lead to confusing output. There is no simple
mechanism to prevent this in DelilaI.
[1995 Dec 7] The namebook mechanism is currently broken for the clist.
technical notes
The variable 'nametype' defines the kind of name picked up in namebook.
The constant 'pagelength' defines the length of the page in the list.
The constant 'topofpage' defines the top of the page in cm in the clist.
There are 4 constants that tell the program the printer page boundaries:
The following bounding box is for the Canon Color Laser Copier 1150.
defaultllx = 7.10999; default for llx, lower left x
defaultlly = 7.01995; default for lly, lower left y
defaulturx = 588.15; default for urx, upper right x
defaultury = 784.98; default for ury, upper right y
These should be set for your printer. To see how this is
done, go to the link given in the See Also.
Alternatively, you can use the edgecontrol parameter.
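As an illustration of how the four constants relate to a PostScript
%%BoundingBox comment, the Python sketch below formats the defaults listed
above. The %%BoundingBox comment takes whole numbers of points, so the values
are rounded outward here; whether hc rounds this way is an assumption, and
the function name is hypothetical.

```python
import math


def bounding_box_comment(llx, lly, urx, ury):
    """Format a %%BoundingBox line, rounding outward to whole points."""
    return "%%BoundingBox: {} {} {} {}".format(
        math.floor(llx), math.floor(lly), math.ceil(urx), math.ceil(ury))


# Defaults quoted above for the Canon Color Laser Copier 1150.
print(bounding_box_comment(7.10999, 7.01995, 588.15, 784.98))
```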
As of version 5.96, hc can sense that a parameter file (hcp) is out
of date and it will automatically upgrade the file. For this reason the
parameter file is now listed as 'inout', meaning that it can be modified
by this program.
*)
(* end module describe.hc *)
{This manual page was created by makman 1.45}