HMMER



HMMER is made up of the following programs:

hmmalign

 --mapali <f>    : include alignment in file <f> (same ali that HMM came from)
 --trim          : trim terminal tails of nonaligned residues from alignment
 --amino         : assert <seqfile>, <hmmfile> both protein: no autodetection
 --dna           : assert <seqfile>, <hmmfile> both DNA: no autodetection
 --rna           : assert <seqfile>, <hmmfile> both RNA: no autodetection
 --informat <s>  : assert <seqfile> is in format <s>: no autodetection
 --outformat <s> : output alignment in format <s>  [Stockholm]

Sequence input formats include: FASTA, EMBL, GenBank, UniProt
Alignment output formats include: Stockholm, Pfam, A2M, PSIBLAST

hmmbuild

hmmbuild :: profile HMM construction from multiple sequence alignments
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmbuild [-options] <hmmfile_out> <msafile>

Basic options:

 -h     : show brief help on version and usage
 -n <s> : name the HMM <s>
 -o <f> : direct summary output to file <f>, not stdout
 -O <f> : resave annotated, possibly modified MSA to file <f>

Options for selecting alphabet rather than guessing it:

 --amino : input alignment is protein sequence data
 --dna   : input alignment is DNA sequence data
 --rna   : input alignment is RNA sequence data

Alternative model construction strategies:

 --fast           : assign cols w/ >= symfrac residues as consensus  [default]
 --hand           : manual construction (requires reference annotation)
 --symfrac <x>    : sets sym fraction controlling --fast construction  [0.5]
 --fragthresh <x> : if L <= x*alen, tag sequence as a fragment  [0.5]
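
The --symfrac and --fragthresh rules above can be made concrete with a minimal Python sketch. It is an unweighted illustration only, not hmmbuild's actual implementation (the real program may also apply sequence weights); the function names and the toy alignment are hypothetical.

```python
def consensus_columns(alignment, symfrac=0.5, gap_chars="-."):
    """Flag each column whose fraction of residues (non-gap symbols) is >= symfrac."""
    nseq = len(alignment)
    ncols = len(alignment[0])
    flags = []
    for col in range(ncols):
        residues = sum(1 for row in alignment if row[col] not in gap_chars)
        flags.append(residues / nseq >= symfrac)
    return flags


def is_fragment(seq_len, alen, fragthresh=0.5):
    """Apply the rule from the help text: fragment if L <= x * alen."""
    return seq_len <= fragthresh * alen


# Hypothetical 4-sequence, 4-column alignment used only for illustration.
toy_alignment = ["AC-G", "A--G", "ACTG", "-C-G"]
print(consensus_columns(toy_alignment))
print(is_fragment(seq_len=40, alen=100))
```

In the toy alignment only the third column falls below the default symfrac of 0.5, so it would be treated as an insert column rather than a consensus column.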

Alternative relative sequence weighting strategies:

 --wpb     : Henikoff position-based weights  [default]
 --wgsc    : Gerstein/Sonnhammer/Chothia tree weights
 --wblosum : Henikoff simple filter weights
 --wnone   : don't do any relative weighting; set all to 1
 --wgiven  : use weights as given in MSA file
 --wid <x> : for --wblosum: set identity cutoff  [0.62]  (0<=x<=1)

Alternative effective sequence weighting strategies:

 --eent       : adjust eff seq # to achieve relative entropy target  [default]
 --eclust     : eff seq # is # of single linkage clusters
 --enone      : no effective seq # weighting: just use nseq
 --eset <x>   : set eff seq # for all models to <x>
 --ere <x>    : for --eent: set minimum rel entropy/position to <x>
 --esigma <x> : for --eent: set sigma param to <x>  [45.0]
 --eid <x>    : for --eclust: set fractional identity cutoff to <x>  [0.62]

Alternative prior strategies:

 --pnone    : don't use any prior; parameters are frequencies
 --plaplace : use a Laplace +1 prior

Handling single sequence inputs:

 --singlemx    : use substitution score matrix for single-sequence inputs
 --popen <x>   : gap open probability (with --singlemx)
 --pextend <x> : gap extend probability (with --singlemx)
 --mx <s>      : substitution score matrix (built-in matrices, with --singlemx)
 --mxfile <f>  : read substitution score matrix from file <f> (with --singlemx)

Control of E-value calibration:

 --EmL <n> : length of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EmN <n> : number of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EvL <n> : length of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EvN <n> : number of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EfL <n> : length of sequences for Forward exp tail tau fit  [100]  (n>0)
 --EfN <n> : number of sequences for Forward exp tail tau fit  [200]  (n>0)
 --Eft <x> : tail mass for Forward exponential tail tau fit  [0.04]  (0<x<1)

Other options:

 --cpu <n>          : number of parallel CPU workers for multithreads
 --stall            : arrest after start: for attaching debugger to process
 --informat <s>     : assert input alifile is in format <s> (no autodetect)
 --seed <n>         : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --w_beta <x>       : tail mass at which window length is determined
 --w_length <n>     : window length
 --maxinsertlen <n> : pretend all inserts are length <= <n>
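
As a rough illustration of the usage line above, the following Python sketch assembles an hmmbuild command line from a few of the listed options. The helper name and the file names (globins.hmm, globins.sto) are hypothetical; only flags that appear in the help text are used.

```python
import shlex


def hmmbuild_argv(hmm_out, msa, name=None, alphabet=None, cpu=None):
    """Build an argv list following: hmmbuild [-options] <hmmfile_out> <msafile>."""
    argv = ["hmmbuild"]
    if name is not None:
        argv += ["-n", name]
    if alphabet is not None:
        if alphabet not in ("amino", "dna", "rna"):
            raise ValueError("alphabet must be 'amino', 'dna' or 'rna'")
        argv.append("--" + alphabet)
    if cpu is not None:
        argv += ["--cpu", str(cpu)]
    # Positional arguments come last: output HMM first, then the input alignment.
    return argv + [hmm_out, msa]


argv = hmmbuild_argv("globins.hmm", "globins.sto",
                     name="globins", alphabet="amino", cpu=4)
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) on a machine where hmmbuild is installed and the alignment file exists.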

hmmconvert

hmmconvert :: convert profile file to a HMMER format
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmconvert [-options] <hmmfile>

Options:

 -h           : show brief help on version and usage
 -a           : ascii:  output models in HMMER3 ASCII format  [default]
 -b           : binary: output models in HMMER3 binary format
 -2           : HMMER2: output backward compatible HMMER2 ASCII format (ls mode)
 --outfmt <s> : choose output legacy 3.x file formats by name, such as '3/a'

hmmemit

hmmemit :: sample sequence(s) from a profile HMM
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmemit [-options] <hmmfile (single)>

Common options are:

 -h     : show brief help on version and usage
 -o <f> : send sequence output to file <f>, not stdout
 -N <n> : number of seqs to sample  [1]  (n>0)

Options controlling what to emit:

 -a : emit alignment
 -c : emit simple majority-rule consensus sequence
 -C : emit fancier consensus sequence (req's --minl, --minu)
 -p : sample sequences from profile, not core model

Options controlling emission from profiles with -p:

 -L <n>      : set expected length from profile to <n>  [400]
 --local     : configure profile in multihit local mode  [default]
 --unilocal  : configure profile in unilocal mode
 --glocal    : configure profile in multihit glocal mode
 --uniglocal : configure profile in unihit glocal mode

Options controlling fancy consensus emission with -C:

 --minl <x> : show consensus as 'any' (X/N) unless >= this fraction  [0.0]
 --minu <x> : show consensus as upper case if >= this fraction  [0.0]

Other options:

 --seed <n> : set RNG seed to <n>  [0]  (n>=0)

hmmfetch

hmmfetch :: retrieve profile HMM(s) from a file
Easel h3.1b1 (May 2013)
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the Janelia Farm Software License.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmfetch [options] <hmmfile> <key>         (retrieves HMM named <key>)
Usage: hmmfetch [options] -f <hmmfile> <keyfile>  (retrieves all HMMs in <keyfile>)
Usage: hmmfetch [options] --index <hmmfile>       (indexes <hmmfile>)

Options:

 -h      : help; show brief info on version and usage
 -f      : second cmdline arg is a file of names to retrieve
 -o <f>  : output HMM to file <f> instead of stdout
 -O      : output HMM to file named <key>
 --index : index the <hmmfile>, creating <hmmfile>.ssi
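
The three usage forms above are mutually exclusive. A small Python sketch of how one might pick between them is shown here; the helper name and the example file and key names are hypothetical, and whether -o is meaningful together with --index is not stated in the help text, so this sketch simply rejects that combination.

```python
def hmmfetch_argv(hmmfile, key=None, keyfile=None, index=False, out=None):
    """Build an argv list for exactly one of the three hmmfetch usage forms."""
    modes = [key is not None, keyfile is not None, bool(index)]
    if sum(modes) != 1:
        raise ValueError("choose exactly one of key, keyfile or index")
    if index and out is not None:
        raise ValueError("this sketch does not combine -o with --index")

    argv = ["hmmfetch"]
    if out is not None:
        argv += ["-o", out]
    if index:
        return argv + ["--index", hmmfile]      # creates <hmmfile>.ssi
    if keyfile is not None:
        return argv + ["-f", hmmfile, keyfile]  # fetch every name in keyfile
    return argv + [hmmfile, key]                # fetch a single named HMM


# Hypothetical file and model names, for illustration only.
print(hmmfetch_argv("models.hmm", index=True))
print(hmmfetch_argv("models.hmm", key="globins", out="globins.hmm"))
```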

hmmlogo

hmmlogo :: given an hmm, produce data required to build an hmm logo
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmlogo <hmmfile> [options]

Options:

 -h                      : show brief help on version and usage
 --height_emission       : total height = relative entropy ; residue height = emission  (default)
 --height_positive_score : total height = relative entropy ; residue height = % of positive score
 --height_bits           : total height = sums of (pos|neg) scores; residue height = score
 --no_indel              : don't provide indel rate values

./hmmlogo

hmmpress

hmmpress :: prepare an HMM database for faster hmmscan searches
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpress [-options] <hmmfile>

Options:

 -h : show brief help on version and usage
 -f : force: overwrite any previous pressed files

hmmscan

hmmscan :: search sequence(s) against a profile database
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmscan [-options] <hmmdb> <seqfile>

Basic options:

 -h : show brief help on version and usage

Options controlling output:

 -o <f>           : direct output to file <f>, not stdout
 --tblout <f>     : save parseable table of per-sequence hits to file <s>
 --domtblout <f>  : save parseable table of per-domain hits to file <s>
 --pfamtblout <f> : save table of hits and domains to file, in Pfam format <s>
 --acc            : prefer accessions over names in output
 --noali          : don't output alignments, so output is smaller
 --notextw        : unlimit ASCII text output line width
 --textw <n>      : set max width of ASCII text output lines  [120]  (n>=120)

Options controlling reporting thresholds:

 -E <x>     : report models <= this E-value threshold in output  [10.0]  (x>0)
 -T <x>     : report models >= this score threshold in output
 --domE <x> : report domains <= this E-value threshold in output  [10.0]  (x>0)
 --domT <x> : report domains >= this score cutoff in output

Options controlling inclusion (significance) thresholds:

 --incE <x>    : consider models <= this E-value threshold as significant
 --incT <x>    : consider models >= this score threshold as significant
 --incdomE <x> : consider domains <= this E-value threshold as significant
 --incdomT <x> : consider domains >= this score threshold as significant

Options for model-specific thresholding:

 --cut_ga : use profile's GA gathering cutoffs to set all thresholding
 --cut_nc : use profile's NC noise cutoffs to set all thresholding
 --cut_tc : use profile's TC trusted cutoffs to set all thresholding

Options controlling acceleration heuristics:

 --max    : Turn all heuristic filters off (less speed, more power)
 --F1 <x> : MSV threshold: promote hits w/ P <= F1  [0.02]
 --F2 <x> : Vit threshold: promote hits w/ P <= F2  [1e-3]
 --F3 <x> : Fwd threshold: promote hits w/ P <= F3  [1e-5]
 --nobias : turn off composition bias filter

Other expert options:

 --nonull2     : turn off biased composition score corrections
 -Z <x>        : set # of comparisons done, for E-value calculation
 --domZ <x>    : set # of significant seqs, for domain E-value calculation
 --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --qformat <s> : assert input <seqfile> is in format <s>: no autodetection
 --daemon      : run program as a daemon
 --cpu <n>     : number of parallel CPU workers to use for multithreads
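
The difference between the reporting threshold (-E) and the inclusion threshold (--incE) listed above can be sketched as follows. This is an illustration under two assumptions: a hit must pass the reporting threshold before it can count as significant, and only E-value thresholds are modelled (not -T or --incT). The help text gives no default for --incE, so the sketch requires the caller to supply it; the value 0.01 in the example is an arbitrary choice for illustration.

```python
def classify_hit(evalue, inc_e, report_e=10.0):
    """Classify a hit by E-value against reporting and inclusion thresholds."""
    if evalue > report_e:
        return "not reported"
    if evalue <= inc_e:
        return "significant"
    return "reported only"


# Illustrative E-values; inc_e=0.01 is an assumed value, not taken from the help text.
for e in (1e-5, 0.5, 50.0):
    print(e, classify_hit(e, inc_e=0.01))
```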

hmmsearch

hmmsearch :: search profile(s) against a sequence database
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmsearch [options] <hmmfile> <seqdb>

Basic options:

 -h : show brief help on version and usage

Options directing output:

 -o <f>           : direct output to file <f>, not stdout
 -A <f>           : save multiple alignment of all hits to file <s>
 --tblout <f>     : save parseable table of per-sequence hits to file <s>
 --domtblout <f>  : save parseable table of per-domain hits to file <s>
 --pfamtblout <f> : save table of hits and domains to file, in Pfam format <s>
 --acc            : prefer accessions over names in output
 --noali          : don't output alignments, so output is smaller
 --notextw        : unlimit ASCII text output line width
 --textw <n>      : set max width of ASCII text output lines  [120]  (n>=120)

Options controlling reporting thresholds:

 -E <x>     : report sequences <= this E-value threshold in output  [10.0]  (x>0)
 -T <x>     : report sequences >= this score threshold in output
 --domE <x> : report domains <= this E-value threshold in output  [10.0]  (x>0)
 --domT <x> : report domains >= this score cutoff in output

Options controlling inclusion (significance) thresholds:

 --incE <x>    : consider sequences <= this E-value threshold as significant
 --incT <x>    : consider sequences >= this score threshold as significant
 --incdomE <x> : consider domains <= this E-value threshold as significant
 --incdomT <x> : consider domains >= this score threshold as significant

Options controlling model-specific thresholding:

 --cut_ga : use profile's GA gathering cutoffs to set all thresholding
 --cut_nc : use profile's NC noise cutoffs to set all thresholding
 --cut_tc : use profile's TC trusted cutoffs to set all thresholding

Options controlling acceleration heuristics:

 --max    : Turn all heuristic filters off (less speed, more power)
 --F1 <x> : Stage 1 (MSV) threshold: promote hits w/ P <= F1  [0.02]
 --F2 <x> : Stage 2 (Vit) threshold: promote hits w/ P <= F2  [1e-3]
 --F3 <x> : Stage 3 (Fwd) threshold: promote hits w/ P <= F3  [1e-5]
 --nobias : turn off composition bias filter

Other expert options:

 --nonull2     : turn off biased composition score corrections
 -Z <x>        : set # of comparisons done, for E-value calculation
 --domZ <x>    : set # of significant seqs, for domain E-value calculation
 --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed)  [42]
 --tformat <s> : assert target <seqfile> is in format <s>: no autodetection
 --cpu <n>     : number of parallel CPU workers to use for multithreads
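
The acceleration options above describe three successive filter stages, each promoting a hit only if its P-value is at or below the stage threshold. A minimal Python sketch of that cascade is given here, using the listed defaults. It does not model the composition bias filter (--nobias) or any other part of the real pipeline, and the function name is hypothetical.

```python
def first_failed_stage(p_msv, p_vit, p_fwd,
                       f1=0.02, f2=1e-3, f3=1e-5, use_max=False):
    """Return the name of the first filter that rejects the hit, or None if it passes."""
    if use_max:
        return None  # --max: all heuristic filters are turned off
    stages = (("MSV", p_msv, f1), ("Vit", p_vit, f2), ("Fwd", p_fwd, f3))
    for stage_name, p_value, threshold in stages:
        if p_value > threshold:
            return stage_name
    return None


print(first_failed_stage(0.01, 5e-4, 1e-6))  # passes all three stages
print(first_failed_stage(0.01, 0.01, 1e-6))  # rejected at the Viterbi stage
```

Raising F1, F2 or F3 lets more candidates through each stage, trading speed for sensitivity.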

hmmsim

hmmsim :: collect profile HMM score distributions on random sequences
HMMER 3.1b1 (May 2013); http://hmmer.org/
Copyright (C) 2013 Howard Hughes Medical Institute.
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmsim [-options] <hmmfile>

Common options:

 -h     : show brief help on version and usage
 -a     : obtain alignment length statistics too
 -v     : verbose: print scores
 -L <n> : length of random target seqs  [100]  (n>0)
 -N <n> : number of random target seqs  [1000]  (n>0)

Output options (only in serial mode, for single HMM input):

 -o <f>      : direct output to file <f>, not stdout
 --afile <f> : output alignment lengths to file <f>
 --efile <f> : output E vs. E plots to <f> in xy format
 --ffile <f> : output filter fraction: # seqs passing P thresh
 --pfile <f> : output P(S>x) plots to <f> in xy format
 --xfile <f> : output bitscores as binary double vector to <f>

Alternative alignment styles :

 --fs : multihit local alignment  [default]
 --sw : unihit local alignment
 --ls : multihit glocal alignment
 --s  : unihit glocal alignment

Alternative scoring algorithms :

 --vit  : score seqs with the Viterbi algorithm  [default]
 --fwd  : score seqs with the Forward algorithm
 --hyb  : score seqs with the Hybrid algorithm
 --msv  : score seqs with the MSV algorithm
 --fast : use the optimized versions of the above

Controlling range of fitted tail masses :

 --tmin <x>    : set lower bound tail mass for fwd,island  [0.02]
 --tmax <x>    : set lower bound tail mass for fwd,island  [0.02]
 --tpoints <n> : set number of tail probs to try  [1]
 --tlinear     : use linear not log spacing of tail probs

Controlling E-value calibration :

 --EmL <n> : length of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EmN <n> : number of sequences for MSV Gumbel mu fit  [200]  (n>0)
 --EvL <n> : length of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EvN <n> : number of sequences for Viterbi Gumbel mu fit  [200]  (n>0)
 --EfL <n> : length of sequences for Forward exp tail tau fit  [100]  (n>0)
 --EfN <n> : number of sequences for Forward exp tail tau fit  [200]  (n>0)
 --Eft <x> : tail mass for Forward exponential tail tau fit  [0.04]  (0<x<1)

Debugging :

 --stall    : arrest after start: for debugging MPI under gdb
 --seed <n> : set random number seed to <n>  [0]

Experiments :

 --bgflat           : set uniform background frequencies
 --bgcomp           : set bg frequencies to model's average composition
 --x-no-lengthmodel : turn the H3 length model off
 --nu <x>           : set nu parameter (# expected HSPs) for GMSV  [2.0]
 --pthresh <x>      : set P-value threshold for --ffile  [0.02]

hmmstat

hmmstat :: display summary statistics for a profile file
HMMER 3.1b1 (May 2013); http://hmmer.org/
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmstat [-options] <hmmfile>

Options:

 -h           : show brief help on version and usage
 --eval2score : compute score for E-value (E) for database of (Z) sequences
 --score2eval : compute E-value for score (S) for database of (Z) sequences
 -Z <n>       : database size (in seqs) for --eval2score or --score2eval
 --baseZ <n>  : database size (M bases) (DNA only, if search on both strands)
 --baseZ1 <n> : database size (M bases) (DNA only, if search on single strand)
 -E <x>       : E-value threshold, for --eval2score
 -S <x>       : Score input for --score2eval
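
The --eval2score and --score2eval conversions relate a bit score S, an E-value E and a database size Z. The Python sketch below shows one way such a conversion can work, assuming that E = Z * P(score > S) and that high scores follow an exponential tail with location tau and rate lambda. Whether hmmstat uses exactly this formula is an assumption, as are the default lambda = ln 2 and the example value of tau; in practice these parameters would come from the calibrated model.

```python
import math


def score_to_evalue(score, z, tau, lam=math.log(2)):
    """E-value for a score: Z times an exponential-tail P-value (capped at 1)."""
    p_value = 1.0 if score < tau else math.exp(-lam * (score - tau))
    return z * p_value


def evalue_to_score(evalue, z, tau, lam=math.log(2)):
    """Invert score_to_evalue: the score whose expected hit count is `evalue`."""
    if not 0 < evalue <= z:
        raise ValueError("need 0 < evalue <= z")
    return tau - math.log(evalue / z) / lam


# Hypothetical parameters: tau = -4.0 bits, database of 1000 sequences.
s = evalue_to_score(0.01, z=1000, tau=-4.0)
print(round(s, 2), score_to_evalue(s, z=1000, tau=-4.0))
```

Under this model, a tenfold larger Z raises the score needed for the same E-value by log2(10), about 3.3 bits.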

hmmpgmd

hmmpgmd :: search a query against a database
HMMER 3.1b1 (May 2013); http://hmmer.org/
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpgmd [options]

Basic options:

 -h : show brief help on version and usage

Other expert options:

 --master     : run program as the master server
 --worker <s> : run program as a worker with server at <s>
 --cport <n>  : port to use for client/server communication  [51371]
 --wport <n>  : port to use for server/worker communication  [51372]
 --ccncts <n> : maximum number of client side connections to accept  [16]
 --wcncts <n> : maximum number of worker side connections to accept  [32]
 --pid <f>    : file to write process id to
 --daemon     : run as a daemon using config file: /etc/hmmpgmd.conf
 --seqdb <f>  : protein database to cache for searches
 --hmmdb <f>  : hmm database to cache for searches
 --cpu <n>    : number of parallel CPU workers to use for multithreads

hmmc2

Usage: ./hmmc2 [-i addr] [-p port] [-A] [-S]

   -S      : print sequence scores
   -A      : print sequence alignments
   -i addr : ip address running daemon (default: 127.0.0.1)
   -p port : port daemon listens to clients on (default: 51371)
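
The hmmpgmd and hmmc2 options above suggest a master process, one or more workers that connect to it, and a client. The Python sketch below assembles command lines for each role using only the listed flags and defaults. The helper names, host address and database file name are hypothetical, and the sketch makes no claim about which additional options a real deployment needs.

```python
def master_argv(seqdb=None, hmmdb=None, cport=51371, wport=51372):
    """Command line for the master server, with optional cached databases."""
    argv = ["hmmpgmd", "--master", "--cport", str(cport), "--wport", str(wport)]
    if seqdb is not None:
        argv += ["--seqdb", seqdb]
    if hmmdb is not None:
        argv += ["--hmmdb", hmmdb]
    return argv


def worker_argv(master_addr, cpu=None):
    """Command line for a worker that connects to the master at master_addr."""
    argv = ["hmmpgmd", "--worker", master_addr]
    if cpu is not None:
        argv += ["--cpu", str(cpu)]
    return argv


def client_argv(addr="127.0.0.1", port=51371, scores=False, alignments=False):
    """Command line for the hmmc2 client; port should match the master's --cport."""
    argv = ["hmmc2", "-i", addr, "-p", str(port)]
    if alignments:
        argv.append("-A")
    if scores:
        argv.append("-S")
    return argv


# Hypothetical database file and addresses, for illustration only.
print(master_argv(seqdb="proteins.db"))
print(worker_argv("127.0.0.1", cpu=4))
print(client_argv(scores=True))
```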

hmmpgmd2msa_utest

hmmpgmd2msa_utest :: given an hmm, produce data required to build an hmm logo
HMMER 3.1b1 (May 2013); http://hmmer.org/
Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Usage: hmmpgmd2msa_utest <hmmfile> [options]

Options:

 -h                      : show brief help on version and usage
 --height_emission       : total height = relative entropy ; residue height = emission  (default)
 --height_positive_score : total height = relative entropy ; residue height = % of positive score
 --height_bits           : total height = sums of (pos|neg) scores; residue height = score
 --no_indel              : don't provide indel rate values

./hmmpgmd2msa_utest

More information on how to run the HMMER programs can be found at the HMMER website (http://hmmer.org/).
