By downloading this code you agree to the
Source Code Use License (PDF).
{ version = 1.35; (* of mfoldfea.p 2007 mar 15}
(* begin module describe.mfoldfea *)
(*
name
mfoldfea: convert mfold RNA structure to lister feature
synopsis
mfoldfea(cootab: in, book: in,
mfoldfeatures: out, output: out)
files
cootab: Concatenation of the .ct files from mfold that correspond
to structures in the book.
mfoldfeatures: features for the lister program.
book: the Delila book used by mfoldseq to make the mfoldsequ file, which
was then used by mfold to make the cootab (coordinate table) file.
output: messages to the user
description
Convert the mfold RNA structures to features for the lister program.
Mfold folds only one sequence at a time, so the run.mfold script runs the
mfoldseq program repeatedly to get all the folds in a book. The script then
gathers these folds together into the cootab file.
This program then takes the cootab file along with the original book to
generate features. It uses the original book to get the coordinate
system, which was lost in converting to the mfold file format.
examples
Starting from a book, run mfoldseq to get a sequ file that contains one
sequence. For the second sequence in the mfold.book example, mfoldseq gives
this mfoldsequ file:
;
mfold.demo.2#2
AAAACCAGGCGCAAATTTTTTTTGCGCCGGG1
Mfold generates the mfoldsequ.ct file:
31 ENERGY = -14.5 mfold.demo.2#2
1 A 0 2 0 1
2 A 1 3 0 2
3 A 2 4 0 3
4 A 3 5 0 4
5 C 4 6 30 5
6 C 5 7 29 6
7 A 6 8 0 7
8 G 7 9 28 8
9 G 8 10 27 9
10 C 9 11 26 10
11 G 10 12 25 11
12 C 11 13 24 12
13 A 12 14 23 13
14 A 13 15 22 14
15 A 14 16 21 15
16 T 15 17 0 16
17 T 16 18 0 17
18 T 17 19 0 18
19 T 18 20 0 19
20 T 19 21 0 20
21 T 20 22 15 21
22 T 21 23 14 22
23 T 22 24 13 23
24 G 23 25 12 24
25 C 24 26 11 25
26 G 25 27 10 26
27 C 26 28 9 27
28 C 27 29 8 28
29 G 28 30 6 29
30 G 29 31 5 30
31 G 30 0 0 31
which is input to this program in the cootab file, along with the original
book.
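As an illustration only (this is not the code of mfoldfea.p), the sketch
below shows one way to read a cootab file such as the one above. It assumes
that each structure starts with a header line of the form
"length ENERGY = value name" and that the fifth column of each base line
holds the pairing partner, with 0 meaning unpaired. The names parse_cootab
and CT_EXAMPLE are hypothetical.

```python
# The .ct example from this page, used as a one-structure cootab.
CT_EXAMPLE = """\
31 ENERGY = -14.5 mfold.demo.2#2
1 A 0 2 0 1
2 A 1 3 0 2
3 A 2 4 0 3
4 A 3 5 0 4
5 C 4 6 30 5
6 C 5 7 29 6
7 A 6 8 0 7
8 G 7 9 28 8
9 G 8 10 27 9
10 C 9 11 26 10
11 G 10 12 25 11
12 C 11 13 24 12
13 A 12 14 23 13
14 A 13 15 22 14
15 A 14 16 21 15
16 T 15 17 0 16
17 T 16 18 0 17
18 T 17 19 0 18
19 T 18 20 0 19
20 T 19 21 0 20
21 T 20 22 15 21
22 T 21 23 14 22
23 T 22 24 13 23
24 G 23 25 12 24
25 C 24 26 11 25
26 G 25 27 10 26
27 C 26 28 9 27
28 C 27 29 8 28
29 G 28 30 6 29
30 G 29 31 5 30
31 G 30 0 0 31
"""


def parse_cootab(text):
    """Yield (name, energy, partner) for each structure in a cootab text.

    partner is 1-based: partner[i] is the base paired with base i,
    or 0 when base i is unpaired (partner[0] is unused).
    """
    rows = [line.split() for line in text.splitlines() if line.strip()]
    pos = 0
    while pos < len(rows):
        header = rows[pos]
        length = int(header[0])
        # Assumed header layout: <length> ENERGY = <value> <name>
        energy = float(header[header.index("=") + 1])
        name = header[-1]
        partner = [0] * (length + 1)
        for fields in rows[pos + 1:pos + 1 + length]:
            partner[int(fields[0])] = int(fields[4])
        yield name, energy, partner
        pos += 1 + length


structures = list(parse_cootab(CT_EXAMPLE))
name, energy, partner = structures[0]
# Keep each pair once, listed from its 5' base.
pairs = [(i, j) for i, j in enumerate(partner) if 0 < i < j]
print(name, energy, pairs)
```

Because the structures are read one after another, the same function also
handles a cootab that concatenates several .ct files.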
The structure, from mfoldsequ.out.html, is:
FOLDING BASES 1 TO 31 OF mfold.demo.2#2
ENERGY = -14.5
         10
AAAA|  A        TT
     CC GGCGCAAA
     GG CCGCGTTT  T
---G^  -        TT
    30        20
When the mfoldfeatures are given to lister, the resulting list is:
*10 * *20 * *30 *
5' a a a a c c a g g c g c a a a t t t t t t t t g c g c c g g g 3'
(-(---(-(-(-(-(-(-(-(-----------)-)-)-)-)-)-)-)-)-) helix.mfold.demo.2#2[8.30]-14.5
The parentheses show bases that are paired together.
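The parenthesis line can be rebuilt from the base pairs in the .ct file.
The following is a minimal sketch, not the actual mfoldfea or lister code.
It assumes that the feature runs from the first paired base to the last
paired base, that each base takes two columns in the listing, and that
unpaired bases inside the feature are shown as '-'. The names PAIRS and
helix_line are hypothetical.

```python
# Base pairs of the example fold, taken from column 5 of the .ct file:
# 5-30, 6-29, then 8-28 through 15-21.
PAIRS = [(5, 30), (6, 29)] + [(i, 36 - i) for i in range(8, 16)]


def helix_line(pairs, separator="-"):
    """Return (first_base, text) with one symbol per base of the feature."""
    opener = dict(pairs)            # 5' base of each pair -> 3' base
    closers = set(opener.values())  # 3' bases
    first = min(opener)
    last = max(closers)
    symbols = []
    for base in range(first, last + 1):
        if base in opener:
            symbols.append("(")
        elif base in closers:
            symbols.append(")")
        else:
            symbols.append("-")     # unpaired base inside the feature
    return first, separator.join(symbols)


first, line = helix_line(PAIRS)
# Assumed layout: the sequence line starts with "5' " (3 columns) and
# each base occupies 2 columns, so base n starts 3 + 2 * (n - 1)
# columns from the left margin.
print(" " * (3 + 2 * (first - 1)) + line)
```

With the example pairs this should give the same pattern of parentheses
and dashes as the lister output above, starting under base 5.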
documentation
see also
The script that runs mfoldseq and mfoldfea: run.mfold
The program that pulls sequences from the book for run.mfold: mfoldseq.p
The program that uses the mfoldfeatures result: lister.p
Small book for demonstrating folding: mfold.book
More documentation on mfold is at Michael Zuker's Web site:
http://www.rpi.edu/~zukerm/
author
Thomas Dana Schneider
bugs
technical notes
The mfold program knows nothing about coordinate systems as used in the
Delila system; it numbers bases from 1 to n, like GenBank. As a result, the
folded coordinates need to be mapped to the book coordinates so that the
features can be written properly. This is a little tricky because there
can be multiple folds per sequence. So the mfoldseq program extracts a
single sequence and names it like EG12#18. This name passes through
mfold unscathed. The mfoldfea program then separates the EG12 part,
the name, from the 18, the book number. When mfoldfea is writing the
features, both the name and the sequence number must match the book.
The number is just a count of the sequences in the book. It is independent
of the number that a user can assign with the 'set numbering' command in
Delila (<NUMBERING DEFAULT>). The coordinates are then taken from the book.
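The name splitting and coordinate mapping can be sketched as follows. This
is an illustration under stated assumptions, not the code of mfoldfea.p:
the name is assumed to be split at its last '#', and the piece is assumed
to be linear with a known first coordinate and a direction of +1 or -1.
The function names and the first coordinate of 100 are hypothetical.

```python
def split_fold_name(label):
    """Split a label like 'EG12#18' into the name and the book number."""
    name, _, number = label.rpartition("#")
    return name, int(number)


def to_book_coordinate(fold_position, first_coordinate, direction=1):
    """Map a 1-to-n mfold position to a book coordinate.

    Assumes a linear piece whose first base has first_coordinate and
    whose coordinates rise (direction=1) or fall (direction=-1).
    """
    return first_coordinate + direction * (fold_position - 1)


print(split_fold_name("EG12#18"))
print(split_fold_name("mfold.demo.2#2"))
# Hypothetical piece starting at coordinate 100: fold position 5.
print(to_book_coordinate(5, 100))
print(to_book_coordinate(5, 100, direction=-1))
```

Splitting at the last '#' keeps any earlier '#' characters in the name
itself. Whether such names occur in real books is not stated here.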
*)
(* end module describe.mfoldfea *)
{This manual page was created by makman 1.45}