By downloading this code you agree to the
Source Code Use License (PDF).
{ version = 1.09; (* of clual.p 2016 Jan 20}
(* begin module describe.clual *)
(*
name
clual: clustal to alpro conversion
synopsis
clual(clustalout: in, clualp: in, protseq: out, output: out)
files
clustalout: output of the CLUSTAL program
protseq: input to the alpro program
clualp: parameters to control the program. The file must contain the
following parameters, one per line:
parameterversion: The version number of the program. This allows the
user to be warned if an old parameter file is used.
The second line of clualp must match the first line
of the clustalout file. This is used to check that
the clustalout file is correct.
verbose (character): If the first character of the third line is
a 'v' then the program will report each segment number as it reads the
segment in, and then give the name of each sequence as it is written out.
output: messages to the user
description
Convert from CLUSTAL format to allow one to present COG output
as a sequence logo. A CLUSTAL alignment is broken up into segments, so
each sequence is split across several blocks of the file, but
alpro requires continuous sequences. This program rearranges the
CLUSTAL data into the form alpro needs.
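The rearrangement described above can be sketched in a few lines. This is
only an illustration in Python, not the Pascal code of clual. The function
name join_clustal_segments and the sample data are hypothetical. The sketch
assumes that every sequence line is a name followed by whitespace and a
piece of the alignment, and that any line starting with a space is a
consensus line to be skipped. It stops at the joined sequences because the
exact layout of protseq is not described here.

```python
def join_clustal_segments(lines):
    """Concatenate the pieces of each sequence across CLUSTAL segments."""
    header = lines[0].rstrip("\n")
    if not header.startswith("CLUSTAL"):
        raise ValueError("not a CLUSTAL file: " + header)

    sequences = dict()  # name -> continuous sequence, in first-seen order
    for line in lines[1:]:
        # Blank lines separate segments. Lines that begin with a space
        # are assumed to be consensus lines and are ignored.
        if not line.strip() or line[0].isspace():
            continue
        fields = line.split()
        if len(fields) < 2:
            continue  # a name with no residues on this line
        name, piece = fields[0], fields[1]
        sequences[name] = sequences.get(name, "") + piece
    return sequences


# Hypothetical two-segment alignment, for illustration only.
sample = [
    "CLUSTAL W (1.74) multiple sequence alignment",
    "",
    "",
    "YPR082c         MKT--A",
    "BS_resA         MRTLLA",
    "                *:*  *",
    "",
    "YPR082c         GG-",
    "BS_resA         GGK",
    "                **",
]

joined = join_clustal_segments(sample)
for name, seq in joined.items():
    print(name, seq)
# prints:
# YPR082c MKT--AGG-
# BS_resA MRTLLAGGK
```

Keeping the names in first-seen order means the continuous sequences come
out in the same order as in the first segment of the CLUSTAL file.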
examples
The first line of a clustal file looks like this:
CLUSTAL W (1.74) multiple sequence alignment
This is used to check that the input is good.
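The check can be pictured as follows. This is a hedged sketch in Python,
not the actual clual code: the names read_clualp and clustalout_is_good are
hypothetical, the wording of the sample parameter lines is invented, and
the sketch assumes that trailing whitespace does not matter when the second
line of clualp is compared with the first line of clustalout.

```python
def read_clualp(lines):
    """Return (parameterversion, expected first line, verbose flag)."""
    # Line 1: the version number is taken to be the first token.
    parameterversion = lines[0].split()[0]
    # Line 2: the text that the first line of clustalout must match.
    expected_first_line = lines[1].rstrip("\n")
    # Line 3: verbose is on when the first character is 'v'.
    verbose = lines[2][:1] == "v"
    return parameterversion, expected_first_line, verbose


def clustalout_is_good(expected_first_line, clustalout_first_line):
    # Assumption: trailing whitespace is not significant.
    return expected_first_line.rstrip() == clustalout_first_line.rstrip()


# Hypothetical parameter file contents, for illustration only.
clualp_lines = [
    "1.09 version of clual that this parameter file is designed for\n",
    "CLUSTAL W (1.74) multiple sequence alignment\n",
    "v verbose\n",
]

version, expected, verbose = read_clualp(clualp_lines)
first_line = "CLUSTAL W (1.74) multiple sequence alignment\n"
print(version, verbose, clustalout_is_good(expected, first_line))
```

A clustalout file produced by a different CLUSTAL version would have a
different first line and so would fail this comparison.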
documentation
see also
example parameter file: clualp
program that uses the output of this program: alpro.p
program that finally generates the sequence logo: makelogo.p
COG: http://www.ncbi.nlm.nih.gov/COG/
COG: ftp://ncbi.nlm.nih.gov/pub/COG/
an example alignment:
http://www.ncbi.nlm.nih.gov/COG/aln/COG0526.aln
the entire list of alignments, ready to grab:
ftp://ncbi.nlm.nih.gov/pub/COG/aln/
wget can be used to grab the alignments:
https://alum.mit.edu/www/toms/wget.html
Why one should not use consensus sequences:
https://alum.mit.edu/www/toms/glossary.html#consensus_sequence
author
Thomas Dana Schneider
bugs
technical notes
The clustal format has wasted spaces. If "_" represents a space,
then we have at the boundary of two segments:
YPR082c_________------------------------------------------------------------
____________________________________________________________________________
_
BS_resA_________------------------
This program ignores the spaces, but one wonders why they are there ...
AH!!! These contain STUPID consensus sequences!!!
The program will ignore this idiotic data line.
*)
(* end module describe.clual *)
{This manual page was created by makman 1.45}
{created by htmlink 1.62}