Delila Program: calhnb

# calhnb program

By downloading this code you agree to the Source Code Use License. Pascal source code: calhnb.p

### Documentation for the calhnb program is below, with links to related programs in the "see also" section.

```
{ version = 2.29; (* of calhnb.p 2005 Jul 16}

(* begin module describe.calhnb *)
(*
name
calhnb: small-sample correction for information and uncertainty

synopsis
calhnb(fin: in, fout: out, output: out)

files
fin: the genomic composition (integers) on one line, followed by
a set of integers, one per line, representing values of n

fout: a table showing n, e(hnb), ae(hnb) and their difference.
The variances var(hnb) and avar(hnb) are tabulated along with
the difference between their square roots, that is, the difference
between the standard deviations.  e(n) is found from the genomic
uncertainty minus e(hnb).  Finally, sd(n) = sqrt(var(hnb)) is given.

output: messages to the user.

describe

Given a genomic composition and a series of integers (n) that represent
the number of sample sites, calhnb calculates the sampling error as e(hnb)
and the variance var(hnb).  It also finds the approximations ae(hnb) and
avar(hnb).  These values are presented in a table along with the
differences between the exact and approximate calculations.  This table
will allow a user to decide when to use the approximations.  Beware that
the exact calculation becomes very expensive for large n.  For this
reason, I use the approximate computation for n > 20 in rseq and alpro.

examples

When the calhnb.fin file is used as fin, the program should write the
contents of the calhnb.fout file to fout.  The data should be identical
to those given in Figure A.2 on page 428 of the Appendix of Schneider
et al. 1986.

documentation

"Information content of binding sites on nucleotide sequences"
T. D. Schneider, G. D. Stormo, L. Gold, and A. Ehrenfeucht
JMB 188:415-431 (1986)  [see link below]

Example       input  file, fin:  calhnb.fin
Corresponding output file, fout: calhnb.fout

fin  file for values up to n = 50: calhnb.50.fin
fout file for values up to n = 50: calhnb.50.fout

Discussion about correcting for small sample size:
https://alum.mit.edu/www/toms/small.sample.correction.html

Schneider et al. (1986):
https://alum.mit.edu/www/toms/paper/schneider1986

related programs: rseq.p, alpro.p

author

Thomas D. Schneider

bugs

It would be nice to have a generalized algorithm for any number
of symbols.

*)
(* end module describe.calhnb *)
{This manual page was created by makman 1.45}

```
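
The exact calculation described above can be illustrated with a short sketch. The following Python is not the calhnb Pascal source. It is a minimal reconstruction under stated assumptions: four symbols, an exact expectation taken over every multinomial outcome of n sites, and an approximation of the form Hg - (s - 1)/(2 ln(2) n), which is the standard first-order small-sample correction and is assumed here to correspond to ae(hnb). The function names `genomic_uncertainty`, `exact_hnb`, and `approx_e_hnb` are hypothetical, and the approximate variance avar(hnb) is left out because its formula is not given in this page.

```python
import math


def genomic_uncertainty(composition):
    """Uncertainty Hg in bits of the genomic composition (symbol counts)."""
    total = sum(composition)
    return -sum((c / total) * math.log2(c / total) for c in composition if c > 0)


def exact_hnb(n, composition):
    """Exact mean and variance of Hnb (bits) for n sites and four symbols.

    Assumption: every outcome (na, nc, ng, nt) with na+nc+ng+nt = n is
    weighted by its multinomial probability under the genomic composition.
    """
    total = sum(composition)
    p = [c / total for c in composition]
    n_fact = math.factorial(n)
    mean = 0.0
    mean_sq = 0.0
    for na in range(n + 1):
        for nc in range(n - na + 1):
            for ng in range(n - na - nc + 1):
                nt = n - na - nc - ng
                counts = (na, nc, ng, nt)
                # Multinomial coefficient n! / (na! nc! ng! nt!), kept exact.
                ways = n_fact
                for k in counts:
                    ways //= math.factorial(k)
                prob = float(ways)
                for k, pk in zip(counts, p):
                    prob *= pk ** k
                # Uncertainty of this particular sample of n sites.
                h = -sum((k / n) * math.log2(k / n) for k in counts if k > 0)
                mean += prob * h
                mean_sq += prob * h * h
    return mean, mean_sq - mean * mean


def approx_e_hnb(n, composition, symbols=4):
    """Assumed approximation: Hg minus (s - 1) / (2 ln(2) n)."""
    return genomic_uncertainty(composition) - (symbols - 1) / (2 * math.log(2) * n)


if __name__ == "__main__":
    comp = (1, 1, 1, 1)  # hypothetical equiprobable composition
    hg = genomic_uncertainty(comp)
    for n in (1, 2, 5, 10, 20):
        e, var = exact_hnb(n, comp)
        print(n, e, approx_e_hnb(n, comp), hg - e, math.sqrt(max(var, 0.0)))
```

The triple loop visits (n+1)(n+2)(n+3)/6 outcomes, so the work grows roughly as n cubed; this is the cost that the man page warns about for large n.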