By downloading this code you agree to the
Source Code Use License (PDF).
{version = 1.91; (* of dbmutate.p 2016 Jan 24}
(* begin module describe.dbmutate *)
(*
name
dbmutate: mutate genbank database
synopsis
dbmutate(dbin: in, dbout: out, dbmutatep: in,
markspots: out, minst: out, output: out)
files
dbin: GenBank flat file format sequences.
dbout: GenBank flat file format sequences, with mutations
as specified by the parameters.
dbmutatep: parameters to control the program.
The first line must be the version number of this program. This
allows the program to recognize when the parameter file is old.
A series of lines like this:
K02402 g20633c
This means that entry K02402 is to be grabbed, and the g at 20633 is to
be changed to c. The output entry name will be "K02402.g20633c".
Multiple changes are allowed, separated by spaces:
K02402 t1c g20633c
Entries with no changes are allowed; they are just copied to dbout:
K02402
To make a deletion, give the endpoints of the deletion range:
M55114 d449,450
Both of the endpoints will be deleted.
The numbers can be the same, to delete one base.
To make an insertion or change, give the endpoints between which to
REPLACE with a new string:
M55114 i449,450tt
The numbers cannot be the same. Use zero (0) to insert before the
start of the sequence and a value larger than the sequence length to
insert after the end of the sequence.
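The three kinds of change described above can be sketched in Python. This
is only an illustration, not the code of dbmutate.p. It assumes the sequence
is a plain string whose first base is coordinate 1, that out-of-range
coordinates are clamped to the ends of the sequence, and that a substitution
checks the old base before changing it. The function names substitute,
delete and insert are hypothetical.

```python
def substitute(seq, pos, old, new):
    """Change the base at 1-based position pos from old to new (like g20633c)."""
    if not 1 <= pos <= len(seq):
        raise ValueError("position %d is outside the sequence" % pos)
    if seq[pos - 1] != old:
        # Assumption: a mismatch with the stated old base is an error.
        raise ValueError("expected %s at %d, found %s" % (old, pos, seq[pos - 1]))
    return seq[:pos - 1] + new + seq[pos:]


def delete(seq, first, last):
    """Delete bases first..last; both endpoints are removed (like d449,450)."""
    if first > last:
        raise ValueError("deletion range is reversed")
    first = max(first, 1)          # clamp to the start of the sequence
    last = min(last, len(seq))     # clamp to the end (an assumption)
    if last < first:
        return seq                 # the range lies entirely outside the sequence
    return seq[:first - 1] + seq[last:]


def insert(seq, left, right, new):
    """Replace everything strictly between left and right with new (like i449,450tt)."""
    if left >= right:
        raise ValueError("the two coordinates cannot be the same or reversed")
    left = max(left, 0)                  # 0 or less: insert before the start
    right = min(right, len(seq) + 1)     # past the end: insert after the end
    return seq[:left] + new + seq[right - 1:]
```

With these definitions insert(seq, 10, 21, "") removes bases 11 to 20, which
is the sense in which an insertion can also act as a deletion.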
Any number of spaces may be between the parts of each instruction, but
the instructions must be one per line. Blank lines are skipped. Lines
that begin with '*' are comments and are skipped.
The parts of an instruction can be separated by spaces or by periods.
The use of periods makes the notation consistent with the name given to
the pieces generated.
A semicolon indicates that the rest of the line is a comment.
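The line format described above can be read with a few lines of Python.
This is a sketch under stated assumptions, not the parser in dbmutate.p:
the names parse_line and output_name are hypothetical, the version line at
the top of the file is assumed to have been read already, and the output
name for an entry with several changes is assumed to join all of them with
periods.

```python
import re


def parse_line(line):
    """Return (entry, changes) for an instruction line.

    Returns None for blank lines, '*' comment lines and '@' parameter lines.
    """
    line = line.split(";", 1)[0]      # a semicolon starts a trailing comment
    if not line.strip() or line.startswith("*") or line.startswith("@"):
        return None
    # Parts may be separated by any number of spaces or by periods.
    parts = [p for p in re.split(r"[\s.]+", line.strip()) if p]
    return parts[0], parts[1:]


def output_name(entry, changes):
    """Build an output entry name such as K02402.g20633c."""
    return ".".join([entry] + changes)
```

For example, parse_line("K02402 g20633c") gives the entry K02402 with the
single change g20633c, and output_name joins them into K02402.g20633c.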
Program parameters can be adjusted by lines that begin with '@'.
The form is '@ commandname value'. The adjustable parameters are:
fromrange: the distance before the first base mutated to get
in the delila instructions (see minst).
torange: the distance after the first base mutated to get
in the delila instructions (see minst).
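A parameter line of the form '@ commandname value' could be read as
follows. This is a minimal sketch, not the program's own code: the function
name parse_parameter is hypothetical, and treating the value as a signed
integer is an assumption based on the examples '@ fromrange -25' and
'@ torange +5'.

```python
def parse_parameter(line):
    """Return (name, value) for a line such as '@ fromrange -25'."""
    if not line.startswith("@"):
        raise ValueError("parameter lines begin with '@'")
    fields = line[1:].split()
    if len(fields) != 2:
        raise ValueError("expected '@ commandname value'")
    name, value = fields
    if name not in ("fromrange", "torange"):
        raise ValueError("unknown parameter: " + name)
    return name, int(value)       # int() accepts a leading '+' or '-'
```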
markspots: The locations to put marks for use by the lister program
in the file marks. They are of the form:
U 1055 0.0 1055 -20.0 0 (g->a) change
where 1055 is the first coordinate given for a mutation. These can be
concatenated with a file like marks.arrow to define the locations of
mutations:
cat marks.arrow markspots > marks
The markspots are generated on the assumption that the user will want
to display alternating pairs consisting of wild type sequence followed
by mutant. The 'p' marks command, as defined in the lister program, is
used to jump to the next piece. To use this mechanism start the
dbmutatep with the GenBank entry for wild type sequences. Follow this
by the mutations of that entry. (The program cannot handle more than
one entry properly at this time.) For your delila instructions, write
pairs of wild type sequence followed by mutant sequence.
minst: delila instructions (inst) for grabbing the regions around the
mutations. The from-to range will default to preset values (see
technical notes) or can be adjusted with an "@" command in dbmutatep.
output: messages to the user
description
Make mutation of GenBank sequences easy. Note that the copy function
makes this program supersede dbpull (although this program is probably
going to be much slower).
Note that the insert function is fully capable of not only doing
insertions but also changes and deletions.
Beware that deletions disrupt the numbering; multiple deletions could
conflict with one another.
THIS PROGRAM IS NOW DEPRECATED because Delila itself
can make mutations. This program is still useful, however,
for people not using the Delila system (shame on you) who
wish to modify a GenBank entry.
examples
1.70 version of dbmutate that this parameter file is designed for.
* Lines that begin with '*' are a comment.
* Substitute the second 10 bases of an entry
K02402 i10,21ggggcccccc
* Inserts before base 0 are considered to be at base 0, and ones after the
* end of the sequence are at the end of the sequence. Here is an insert
* from -20 to 1 of 10 bases followed by a deletion later:
K02402 i-20,1ggggcccccc d61,70
* This kind of double insert and deletion of the same length is useful for
* checking inserts and deletions by using the Unix diff program, because
* only one line changes.
* This one deletes the first 10 bases and makes a compensating insert:
K02402 d-5,10.i20,21ggggcccccc
* Note that the instruction parts are separated by a period above.
* Replace exactly 10 bases:
K02402 i10,21cccgggcccc
* Delete exactly 10 bases:
K02402 i10,21
* set the from-to range:
@ fromrange -25
@ torange +5
documentation
see also
Parameter file: dbmutatep
Related Programs:
delila.p, dbbk.p, dbclean.p, dbpull.p, marks.arrow
author
Thomas Dana Schneider
bugs
* changesetmax should not be needed; replace by linked list
technical notes
Constant changesetmax is the largest number of changes allowed per entry.
Constant sequencemax is the largest length sequence that can be handled.
Because the program creates a new accession name, it will strip away
any secondary accession names.
Default values for the from-to range are in constants deffromrange and
deftorange.
*)
(* end module describe.dbmutate *)
{This manual page was created by makman 1.45}
{created by htmlink 1.62}