Delila Program: encfrq

By downloading this code you agree to the
Source Code Use License (PDF).

Copyright Statement for Delila Programs

Documentation for the encfrq program is below, with links to related programs in the "see also" section.

{version = 1.53; (* of encfrq.p 1994 sep 5}

(* begin module describe.encfrq *)
(*
name
      encfrq: encoded sequence frequency analysis

synopsis
      encfrq(encseq: in, cmp: in, fout: out, output: out)

files
      encseq: the output of the encode program
      cmp: a composition from the comp program.
      fout: frequency tables for each parameter set.  these are followed
         by z values for each frequency.  if cmp is empty, then equal
         frequencies are assumed.
      output: messages to the user.

description
      the frequency of each n-tide (mono-, di-, etc.) is displayed in
      fout.  the actual number of sequences passing through a particular
      n-tide and position (i.e., a parameter window) is taken into account.
         a second set of tables, containing z values, is also presented.
      these are calculated from the composition provided in cmp (p, the
      probability of obtaining the n-tide), the actual number of
      occurrences (b) and the number of sequences at that position (n).
      the distribution of b can be described as a binomial distribution,
      with mean m = np and standard deviation s = sqrt(npq), where
      q = 1-p.  b is then normalized to obtain z: z = (b-m)/s.  if n is
      large, then z is approximately normally distributed, and the
      probabilities can be found in any table for the normal
      distribution (use a two-tailed test).  a rule of thumb for when
      the normal distribution can be used is that both np and n(1-p)
      should be greater than 5.  locations that violate this rule are
      marked with a '*'.
         locations of the z table that contain z values of 3 or greater are
      displayed to the right of the z table.  since these look somewhat
      like a dna footprint, they are called z-footprints.  the output
      for dinucleotide z-footprints is very wide, so one must split
      it up using the split program.  recommended values for splitp are
      p/14/112/4, where the slash means "start a new line".

see also
      encode.p, comp.p, split.p

author
      thomas d. schneider

bugs
      none known

*)
(* end module describe.encfrq *)
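
The z calculation in the description can be illustrated with a short sketch. It is written in Python for convenience and is not taken from encfrq.p; the names z_score, equal_composition and Z_FOOTPRINT_THRESHOLD are hypothetical, and the threshold test follows the wording "3 or greater" literally. The sketch assumes a four-letter alphabet when cmp is empty.

```python
import math

# Hypothetical constant: the description says z values "of 3 or greater"
# are shown in the z-footprint.
Z_FOOTPRINT_THRESHOLD = 3.0


def equal_composition(n_tide_length, alphabet_size=4):
    """Probability of one n-tide when all bases are assumed equally frequent
    (the case the manual describes for an empty cmp file)."""
    return 1.0 / alphabet_size ** n_tide_length


def z_score(b, n, p):
    """Normalize b occurrences among n sequences, given n-tide probability p.

    Returns (z, normal_ok), where normal_ok is False when the rule of thumb
    (np > 5 and n(1-p) > 5) is violated, i.e. when the manual marks a '*'.
    """
    if n <= 0 or not 0.0 < p < 1.0:
        raise ValueError("need n > 0 and 0 < p < 1")
    q = 1.0 - p
    m = n * p                   # binomial mean
    s = math.sqrt(n * p * q)    # binomial standard deviation
    z = (b - m) / s
    normal_ok = n * p > 5 and n * q > 5
    return z, normal_ok


if __name__ == "__main__":
    p = equal_composition(1)                  # mononucleotide, p = 0.25
    z, normal_ok = z_score(b=40, n=100, p=p)
    flag = "" if normal_ok else "*"
    in_footprint = z >= Z_FOOTPRINT_THRESHOLD
    print(f"z = {z:.2f}{flag}  footprint: {in_footprint}")
```

With b = 40, n = 100 and p = 0.25, the mean is 25 and the standard deviation is sqrt(18.75), so z is about 3.46; both np and n(1-p) exceed 5, so no '*' is attached. The actual table layout written to fout is not reproduced here.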
{This manual page was created by makman 1.45}

{created by htmlink 1.62}