HMMER


HMMER is made up of the following programs:

hmmalign

 hmmalign :: align sequences to a profile HMM
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmalign [-options] <hmmfile> <seqfile>

Basic options:

 -h     : show brief help on version and usage
 -o <f> : output alignment to file <f>, not stdout

Less common options:

 --mapali <f>    : include alignment in file <f> (same ali that HMM came from)
 --trim          : trim terminal tails of nonaligned residues from alignment
 --amino         : assert <seqfile>, <hmmfile> both protein: no autodetection
 --dna           : assert <seqfile>, <hmmfile> both DNA: no autodetection
 --rna           : assert <seqfile>, <hmmfile> both RNA: no autodetection
 --informat <s>  : assert <seqfile> is in format <s>: no autodetection
 --outformat <s> : output alignment in format <s>  [Stockholm]

Sequence input formats include: FASTA, EMBL, GenBank, UniProt
Alignment output formats include: Stockholm, Pfam, A2M, PSIBLAST
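
To illustrate the usage line above, the following Python sketch assembles an hmmalign command line from the options listed. The function name hmmalign_cmd and the file names are hypothetical. Only flags from the help text above are used, and nothing is executed: the command is only printed.

```python
import shlex

def hmmalign_cmd(hmmfile, seqfile, out=None, trim=False, outformat=None):
    """Build an argv list for: hmmalign [-options] <hmmfile> <seqfile>."""
    cmd = ["hmmalign"]
    if out is not None:
        cmd += ["-o", out]                 # write alignment to a file, not stdout
    if trim:
        cmd.append("--trim")               # trim nonaligned terminal tails
    if outformat is not None:
        cmd += ["--outformat", outformat]  # e.g. Stockholm, Pfam, A2M, PSIBLAST
    cmd += [hmmfile, seqfile]              # positional arguments come last
    return cmd

# Hypothetical file names, for illustration only.
cmd = hmmalign_cmd("globins.hmm", "seqs.fasta", out="aligned.sto", trim=True)
print(shlex.join(cmd))
```

The returned list can be handed to subprocess.run on a machine where hmmalign is installed.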

hmmbuild

 hmmbuild :: profile HMM construction from multiple sequence alignments
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmbuild [-options] <hmmfile_out> <msafile>

Basic options:

 -h     : show brief help on version and usage
 -n <s> : name the HMM <s>
 -o <f> : direct summary output to file <f>, not stdout
 -O <f> : resave annotated, possibly modified MSA to file <f>

Options for selecting alphabet rather than guessing it:

 --amino : input alignment is protein sequence data
 --dna   : input alignment is DNA sequence data
 --rna   : input alignment is RNA sequence data

Alternative model construction strategies:

 --fast           : assign cols w/ >= symfrac residues as consensus  [default]
 --hand           : manual construction (requires reference annotation)
 --symfrac <x>    : sets sym fraction controlling --fast construction  [0.5]
 --fragthresh <x> : if L <= x*alen, tag sequence as a fragment  [0.5]
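
The --symfrac and --fragthresh thresholds above can be read as simple per-column and per-sequence rules. The Python sketch below illustrates that reading under simplifying assumptions and is not HMMER's implementation: every sequence counts equally (no relative weights), `-` and `.` are treated as gaps, L is taken to be a sequence's residue count, and fragments are not excluded from the column counts. The function names and the toy alignment are hypothetical.

```python
GAPS = set("-.")

def consensus_columns(msa, symfrac=0.5):
    """Indices of columns whose fraction of residues is >= symfrac."""
    nseq = len(msa)
    alen = len(msa[0])
    cols = []
    for j in range(alen):
        residues = sum(1 for seq in msa if seq[j] not in GAPS)
        if residues / nseq >= symfrac:
            cols.append(j)
    return cols

def is_fragment(aligned_seq, fragthresh=0.5):
    """Tag a sequence as a fragment if L <= fragthresh * alen."""
    alen = len(aligned_seq)
    L = sum(1 for c in aligned_seq if c not in GAPS)
    return L <= fragthresh * alen

# Toy alignment: 4 sequences, 6 columns.
msa = [
    "ACD-EF",
    "AC--EF",
    "A---E-",
    "-C-GEF",
]
print(consensus_columns(msa))         # [0, 1, 4, 5]
print([is_fragment(s) for s in msa])  # [False, False, True, False]
```

Raising symfrac makes fewer columns count as consensus, and raising fragthresh tags more sequences as fragments.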

Alternative relative sequence weighting strategies:

 --wpb     : Henikoff position-based weights  [default]
 --wgsc    : Gerstein/Sonnhammer/Chothia tree weights
 --wblosum : Henikoff simple filter weights
 --wnone   : don't do any relative weighting; set all to 1
 --wgiven  : use weights as given in MSA file
 --wid <x> : for --wblosum: set identity cutoff  [0.62]  (0<=x<=1)

Alternative effective sequence weighting strategies:

 --eent       : adjust eff seq # to achieve relative entropy target  [default]
 --eclust     : eff seq # is # of single linkage clusters
 --enone      : no effective seq # weighting: just use nseq
 --eset <x>   : set eff seq # for all models to <x>
 --ere <x>    : for --eent: set minimum rel entropy/position to <x>
 --esigma <x> : for --eent: set sigma param to <x>  [45.0]
 --eid <x>    : for --eclust: set fractional identity cutoff to <x>  [0.62]

Alternative prior strategies:

 --pnone    : don't use any prior; parameters are frequencies
 --plaplace : use a Laplace +1 prior

Handling single sequence inputs:

 --singlemx    : use substitution score matrix for single-sequence inputs
 --popen <x>   : gap open probability (with --singlemx)
 --pextend <x> : gap extend probability (with --singlemx)
 --mx <s>      : substitution score matrix (built-in matrices, with --singlemx)
 --mxfile <f>  : read substitution score matrix from file <f> (with --singlemx)

Control of E-value calibration:

 --EmL <n> : length of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EmN <n> : number of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EvL <n> : length of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EvN <n> : number of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EfL <n> : length of sequences for Forward exp tail tau fit  [100]  (n>0)
 --EfN <n> : number of sequences for Forward exp tail tau fit  [200]  (n>0)
 --Eft <x> : tail mass for Forward exponential tail tau fit  [0.04]  (0<x<1)

Other options:

 --cpu <n>          : number of parallel CPU workers for multithreads
 --stall            : arrest after start: for attaching debugger to process
 --informat <s>     : assert input alifile is in format <s> (no autodetect)
 --seed <n>         : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --w_beta <x>       : tail mass at which window length is determined
 --w_length <n>     : window length
 --maxinsertlen <n> : pretend all inserts are length <= <n>


hmmconvert

 hmmconvert :: convert profile file to a HMMER format
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmconvert [-options] <hmmfile>

Options:

 -h           : show brief help on version and usage
 -a           : ascii:  output models in HMMER3 ASCII format  [default]
 -b           : binary: output models in HMMER3 binary format
 -2           : HMMER2: output backward compatible HMMER2 ASCII format (ls mode)
 --outfmt <s> : choose output legacy 3.x file formats by name, such as '3/a'


hmmemit

 hmmemit :: sample sequence(s) from a profile HMM
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmemit [-options] <hmmfile (single)>

Common options are:

 -h     : show brief help on version and usage
 -o <f> : send sequence output to file <f>, not stdout
 -N <n> : number of seqs to sample  [1]  (n>0)

Options controlling what to emit:

 -a : emit alignment
 -c : emit simple majority-rule consensus sequence
 -C : emit fancier consensus sequence (req's --minl, --minu)
 -p : sample sequences from profile, not core model

Options controlling emission from profiles with -p:

 -L <n>      : set expected length from profile to <n>  [400]
 --local     : configure profile in multihit local mode  [default]
 --unilocal  : configure profile in unilocal mode
 --glocal    : configure profile in multihit glocal mode
 --uniglocal : configure profile in unihit glocal mode

Options controlling fancy consensus emission with -C:

 --minl <x> : show consensus as 'any' (X/N) unless >= this fraction  [0.0]
 --minu <x> : show consensus as upper case if >= this fraction  [0.0]

Other options:

 --seed <n> : set RNG seed to <n>  [0]  (n>=0)


hmmfetch

 hmmfetch :: retrieve profile HMM(s) from a file
 Easel h3.1b1 (May 2013)
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the Janelia Farm Software License.
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmfetch [options] <hmmfile> <key>         (retrieves HMM named <key>)
Usage: hmmfetch [options] -f <hmmfile> <keyfile>  (retrieves all HMMs in <keyfile>)
Usage: hmmfetch [options] --index <hmmfile>       (indexes <hmmfile>)

Options:

 -h      : help; show brief info on version and usage
 -f      : second cmdline arg is a file of names to retrieve
 -o <f>  : output HMM to file <f> instead of stdout
 -O      : output HMM to file named <key>
 --index : index the <hmmfile>, creating <hmmfile>.ssi


hmmlogo

 hmmlogo :: given an hmm, produce data required to build an hmm logo
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmlogo <hmmfile> [options]

Options:

 -h                      : show brief help on version and usage
 --height_emission       : total height = relative entropy ; residue height = emission  (default)
 --height_positive_score : total height = relative entropy ; residue height = % of positive score
 --height_bits           : total height = sums of (pos|neg) scores; residue height = score
 --no_indel              : don't provide indel rate values

./hmmlogo

hmmpress

 hmmpress :: prepare an HMM database for faster hmmscan searches
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpress [-options] <hmmfile>

Options:

 -h : show brief help on version and usage
 -f : force: overwrite any previous pressed files


hmmscan

 hmmscan :: search sequence(s) against a profile database
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmscan [-options] <hmmdb> <seqfile>

Basic options:

 -h : show brief help on version and usage

Options controlling output:

 -o <f>           : direct output to file <f>, not stdout
 --tblout <f>     : save parseable table of per-sequence hits to file <f>
 --domtblout <f>  : save parseable table of per-domain hits to file <f>
 --pfamtblout <f> : save table of hits and domains to file <f>, in Pfam format
 --acc            : prefer accessions over names in output
 --noali          : don't output alignments, so output is smaller
 --notextw        : unlimit ASCII text output line width
 --textw <n>      : set max width of ASCII text output lines  [120]  (n>=120)

Options controlling reporting thresholds:

 -E <x>     : report models <= this E-value threshold in output  [10.0]  (x>0)
 -T <x>     : report models >= this score threshold in output
 --domE <x> : report domains <= this E-value threshold in output  [10.0]  (x>0)
 --domT <x> : report domains >= this score cutoff in output

Options controlling inclusion (significance) thresholds:

 --incE <x>    : consider models <= this E-value threshold as significant
 --incT <x>    : consider models >= this score threshold as significant
 --incdomE <x> : consider domains <= this E-value threshold as significant
 --incdomT <x> : consider domains >= this score threshold as significant
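
Reporting and inclusion thresholds act as two separate cutoffs on the same hits: a hit is listed in the output if it meets the reporting threshold, and it is treated as significant if it also meets the inclusion threshold. A minimal Python sketch of that reading follows. The function name and the example E-values are hypothetical. The reporting default of 10.0 comes from the help text, while the inclusion cutoff must be passed explicitly because no default is shown above; the 0.01 used here is only an example value.

```python
def classify_hit(evalue, report_e=10.0, inc_e=None):
    """Return 'significant', 'reported', or 'dropped' for one E-value."""
    if evalue > report_e:
        return "dropped"       # fails -E: not shown in the output
    if inc_e is not None and evalue <= inc_e:
        return "significant"   # also passes --incE
    return "reported"          # shown, but not counted as significant

for e in (1e-30, 0.5, 50.0):
    print(e, classify_hit(e, inc_e=0.01))
```

The same two-level logic would apply to the per-domain options (--domE with --incdomE), and with the comparison reversed for the score-based options.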

Options for model-specific thresholding:

 --cut_ga : use profile's GA gathering cutoffs to set all thresholding
 --cut_nc : use profile's NC noise cutoffs to set all thresholding
 --cut_tc : use profile's TC trusted cutoffs to set all thresholding

Options controlling acceleration heuristics:

 --max    : Turn all heuristic filters off (less speed, more power)
 --F1 <x> : MSV threshold: promote hits w/ P <= F1  [0.02]
 --F2 <x> : Vit threshold: promote hits w/ P <= F2  [1e-3]
 --F3 <x> : Fwd threshold: promote hits w/ P <= F3  [1e-5]
 --nobias : turn off composition bias filter

Other expert options:

 --nonull2     : turn off biased composition score corrections
 -Z <x>        : set # of comparisons done, for E-value calculation
 --domZ <x>    : set # of significant seqs, for domain E-value calculation
 --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --qformat <s> : assert input <seqfile> is in format <s>: no autodetection
 --daemon      : run program as a daemon
 --cpu <n>     : number of parallel CPU workers to use for multithreads
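
Because hmmpress prepares a profile database for hmmscan, the two programs are typically run one after the other. The Python sketch below builds both command lines in that order. The function name and file names are hypothetical, only flags listed in the help text are used, and the commands are printed rather than executed.

```python
import shlex

def scan_commands(hmmdb, seqfile, tblout, cpu=None, force=False):
    """Return [hmmpress argv, hmmscan argv] for a press-then-scan run."""
    press = ["hmmpress"]
    if force:
        press.append("-f")              # overwrite previously pressed files
    press.append(hmmdb)

    scan = ["hmmscan", "--tblout", tblout, "--noali"]
    if cpu is not None:
        scan += ["--cpu", str(cpu)]     # number of parallel CPU workers
    scan += [hmmdb, seqfile]            # <hmmdb> first, then <seqfile>
    return [press, scan]

# Hypothetical file names, for illustration only.
for argv in scan_commands("profiles.hmm", "query.fasta", "hits.tbl", cpu=4):
    print(shlex.join(argv))
```

Each list can be passed to subprocess.run(argv, check=True) where HMMER is installed, so that a failed hmmpress stops the run before hmmscan starts.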


hmmsearch

 hmmsearch :: search profile(s) against a sequence database
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmsearch [options] <hmmfile> <seqdb>

Basic options:

 -h : show brief help on version and usage

Options directing output:

 -o <f>           : direct output to file <f>, not stdout
 -A <f>           : save multiple alignment of all hits to file <f>
 --tblout <f>     : save parseable table of per-sequence hits to file <f>
 --domtblout <f>  : save parseable table of per-domain hits to file <f>
 --pfamtblout <f> : save table of hits and domains to file <f>, in Pfam format
 --acc            : prefer accessions over names in output
 --noali          : don't output alignments, so output is smaller
 --notextw        : unlimit ASCII text output line width
 --textw <n>      : set max width of ASCII text output lines  [120]  (n>=120)

Options controlling reporting thresholds:

 -E <x>     : report sequences <= this E-value threshold in output  [10.0]  (x>0)
 -T <x>     : report sequences >= this score threshold in output
 --domE <x> : report domains <= this E-value threshold in output  [10.0]  (x>0)
 --domT <x> : report domains >= this score cutoff in output

Options controlling inclusion (significance) thresholds:

 --incE <x>    : consider sequences <= this E-value threshold as significant
 --incT <x>    : consider sequences >= this score threshold as significant
 --incdomE <x> : consider domains <= this E-value threshold as significant
 --incdomT <x> : consider domains >= this score threshold as significant

Options controlling model-specific thresholding:

 --cut_ga : use profile's GA gathering cutoffs to set all thresholding
 --cut_nc : use profile's NC noise cutoffs to set all thresholding
 --cut_tc : use profile's TC trusted cutoffs to set all thresholding

Options controlling acceleration heuristics:

 --max    : Turn all heuristic filters off (less speed, more power)
 --F1 <x> : Stage 1 (MSV) threshold: promote hits w/ P <= F1  [0.02]
 --F2 <x> : Stage 2 (Vit) threshold: promote hits w/ P <= F2  [1e-3]
 --F3 <x> : Stage 3 (Fwd) threshold: promote hits w/ P <= F3  [1e-5]
 --nobias : turn off composition bias filter
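
The three stage thresholds describe a cascade: a candidate is promoted to the next stage only if its P-value at the current stage is at or below that stage's cutoff. The Python sketch below illustrates that control flow only. The P-values are supplied by the caller rather than computed, the bias filter is not modelled, and the function name is hypothetical. The defaults are the ones listed above.

```python
def rejecting_stage(p_msv, p_vit, p_fwd, F1=0.02, F2=1e-3, F3=1e-5, use_max=False):
    """Return the first stage that rejects a hit, or None if it survives all three."""
    if use_max:
        return None  # --max: all heuristic filters are off
    stages = (("MSV", p_msv, F1), ("Vit", p_vit, F2), ("Fwd", p_fwd, F3))
    for name, p, cutoff in stages:
        if p > cutoff:
            return name  # not promoted past this stage
    return None

print(rejecting_stage(0.5, 0.0, 0.0))      # rejected by the first stage
print(rejecting_stage(0.01, 1e-4, 1e-6))   # survives all three stages
```

Loosening a cutoff (a larger F1, F2 or F3) lets more candidates through, which matches the "less speed, more power" trade-off noted for --max.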

Other expert options:

 --nonull2     : turn off biased composition score corrections
 -Z <x>        : set # of comparisons done, for E-value calculation
 --domZ <x>    : set # of significant seqs, for domain E-value calculation
 --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --tformat <s> : assert target <seqfile> is in format <s>: no autodetection
 --cpu <n>     : number of parallel CPU workers to use for multithreads
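
The programs on this page share one calling convention: the program name, then options (a single dash for one-letter options, a double dash for longer ones), then the positional file arguments. The Python sketch below turns that convention into a small helper and uses it for a build-then-search run. The helper name hmmer_cmd and the file names are hypothetical. The helper does not check that an option exists, so the keyword names must match the help text.

```python
import shlex

def hmmer_cmd(program, positional, **options):
    """Build an argv list: program, then options, then positional arguments."""
    cmd = [program]
    for name, value in options.items():
        flag = ("-" if len(name) == 1 else "--") + name
        if value is True:
            cmd.append(flag)            # switch without an argument
        elif value is not None and value is not False:
            cmd += [flag, str(value)]   # option that takes an argument
    return cmd + list(positional)

# Hypothetical file names, for illustration only.
build = hmmer_cmd("hmmbuild", ["globins.hmm", "globins.sto"], n="globins", amino=True)
search = hmmer_cmd("hmmsearch", ["globins.hmm", "proteins.fasta"],
                   E=1e-5, tblout="hits.tbl", cpu=4)
print(shlex.join(build))
print(shlex.join(search))
```

Keeping the commands as lists rather than strings avoids shell quoting problems when file names contain spaces.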


hmmsim

 hmmsim :: collect profile HMM score distributions on random sequences
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmsim [-options] <hmmfile>

Common options:

 -h     : show brief help on version and usage
 -a     : obtain alignment length statistics too
 -v     : verbose: print scores
 -L <n> : length of random target seqs  [100]  (n>0)
 -N <n> : number of random target seqs  [1000]  (n>0)

Output options (only in serial mode, for single HMM input):

 -o <f>      : direct output to file <f>, not stdout
 --afile <f> : output alignment lengths to file <f>
 --efile <f> : output E vs. E plots to <f> in xy format
 --ffile <f> : output filter fraction: # seqs passing P thresh
 --pfile <f> : output P(S>x) plots to <f> in xy format
 --xfile <f> : output bitscores as binary double vector to <f>

Alternative alignment styles :

 --fs : multihit local alignment  [default]
 --sw : unihit local alignment
 --ls : multihit glocal alignment
 --s  : unihit glocal alignment

Alternative scoring algorithms :

 --vit  : score seqs with the Viterbi algorithm  [default]
 --fwd  : score seqs with the Forward algorithm
 --hyb  : score seqs with the Hybrid algorithm
 --msv  : score seqs with the MSV algorithm
 --fast : use the optimized versions of the above

Controlling range of fitted tail masses :

 --tmin <x>    : set lower bound tail mass for fwd,island  [0.02]
 --tmax <x>    : set upper bound tail mass for fwd,island  [0.02]
 --tpoints <n> : set number of tail probs to try  [1]
 --tlinear     : use linear not log spacing of tail probs
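
One plausible reading of these four options is that --tpoints tail masses are spread between --tmin and --tmax, log-spaced unless --tlinear is given. The Python sketch below shows that reading only. It is an assumption about the spacing, not a description of hmmsim's internals, and the function name is hypothetical.

```python
import math

def tail_masses(tmin=0.02, tmax=0.02, tpoints=1, linear=False):
    """Return tpoints tail masses from tmin to tmax, log-spaced by default."""
    if tpoints == 1:
        return [tmin]                       # a single point: just the lower bound
    if linear:
        step = (tmax - tmin) / (tpoints - 1)
        return [tmin + i * step for i in range(tpoints)]
    lo, hi = math.log(tmin), math.log(tmax)
    step = (hi - lo) / (tpoints - 1)
    return [math.exp(lo + i * step) for i in range(tpoints)]

print(tail_masses())                        # defaults from the help text
print(tail_masses(0.001, 0.1, 3))           # log spacing
print(tail_masses(0.01, 0.05, 3, linear=True))
```

With the defaults shown in the help text, both bounds are 0.02 and only one tail mass is tried.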

Controlling E-value calibration :

 --EmL <n> : length of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EmN <n> : number of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EvL <n> : length of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EvN <n> : number of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EfL <n> : length of sequences for Forward exp tail tau fit  [100]  (n>0)
 --EfN <n> : number of sequences for Forward exp tail tau fit  [200]  (n>0)
 --Eft <x> : tail mass for Forward exponential tail tau fit  [0.04]  (0<x<1)

Debugging :

 --stall    : arrest after start: for debugging MPI under gdb
 --seed <n> : set random number seed to <n>  [0]

Experiments :

 --bgflat           : set uniform background frequencies
 --bgcomp           : set bg frequencies to model's average composition
 --x-no-lengthmodel : turn the H3 length model off
 --nu <x>           : set nu parameter (# expected HSPs) for GMSV  [2.0]
 --pthresh <x>      : set P-value threshold for --ffile  [0.02]


hmmstat

 hmmstat :: display summary statistics for a profile file
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmstat [-options] <hmmfile>

Options:

 -h           : show brief help on version and usage
 --eval2score : compute score for E-value (E) for database of (Z) sequences
 --score2eval : compute E-value for score (S) for database of (Z) sequences
 -Z <n>       : database size (in seqs) for --eval2score or --score2eval
 --baseZ <n>  : database size (M bases) (DNA only, if search on both strands)
 --baseZ1 <n> : database size (M bases) (DNA only, if search on single strand)
 -E <x>       : E-value threshold, for --eval2score
 -S <x>       : Score input for --score2eval


hmmpgmd

 hmmpgmd :: search a query against a database
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpgmd [options]

Basic options:

 -h : show brief help on version and usage

Other expert options:

 --master     : run program as the master server
 --worker <s> : run program as a worker with server at <s>
 --cport <n>  : port to use for client/server communication  [51371]
 --wport <n>  : port to use for server/worker communication  [51372]
 --ccncts <n> : maximum number of client side connections to accept  [16]
 --wcncts <n> : maximum number of worker side connections to accept  [32]
 --pid <f>    : file to write process id to
 --daemon     : run as a daemon using config file: /etc/hmmpgmd.conf
 --seqdb <f>  : protein database to cache for searches
 --hmmdb <f>  : hmm database to cache for searches
 --cpu <n>    : number of parallel CPU workers to use for multithreads


hmmc2

Usage: ./hmmc2 [-i addr] [-p port] [-A] [-S]

   -S      : print sequence scores
   -A      : print sequence alignments
   -i addr : ip address running daemon (default: 127.0.0.1)
   -p port : port daemon listens to clients on (default: 51371)


hmmpgmd2msa_utest

 hmmpgmd2msa_utest :: given an hmm, produce data required to build an hmm logo
 HMMER 3.1b1 (May 2013); http://hmmer.org/
 Copyright (C) 2013 Howard Hughes Medical Institute.
 Freely distributed under the GNU General Public License (GPLv3).
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpgmd2msa_utest <hmmfile> [options]

Options:

 -h                      : show brief help on version and usage
 --height_emission       : total height = relative entropy ; residue height = emission  (default)
 --height_positive_score : total height = relative entropy ; residue height = % of positive score
 --height_bits           : total height = sums of (pos|neg) scores; residue height = score
 --no_indel              : don't provide indel rate values

./hmmpgmd2msa_utest

More information on how to run the HMMER programs can be found at the HMMER website (http://hmmer.org/).