HMMER is made up of the following programs:
hmmalign
- hmmalign :: align sequences to a profile HMM
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmalign [-options] <hmmfile> <seqfile>
Basic options:
  -h     : show brief help on version and usage
  -o <f> : output alignment to file <f>, not stdout
Less common options:
  --mapali <f>    : include alignment in file <f> (same ali that HMM came from)
  --trim          : trim terminal tails of nonaligned residues from alignment
  --amino         : assert <seqfile>, <hmmfile> both protein: no autodetection
  --dna           : assert <seqfile>, <hmmfile> both DNA: no autodetection
  --rna           : assert <seqfile>, <hmmfile> both RNA: no autodetection
  --informat <s>  : assert <seqfile> is in format <s>: no autodetection
  --outformat <s> : output alignment in format <s> [Stockholm]
Sequence input formats include:   FASTA, EMBL, GenBank, UniProt
Alignment output formats include: Stockholm, Pfam, A2M, PSIBLAST
hmmbuild
- hmmbuild :: profile HMM construction from multiple sequence alignments
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmbuild [-options] <hmmfile_out> <msafile>
Basic options:
  -h     : show brief help on version and usage
  -n <s> : name the HMM <s>
  -o <f> : direct summary output to file <f>, not stdout
  -O <f> : resave annotated, possibly modified MSA to file <f>
Options for selecting alphabet rather than guessing it:
  --amino : input alignment is protein sequence data
  --dna   : input alignment is DNA sequence data
  --rna   : input alignment is RNA sequence data
Alternative model construction strategies:
  --fast           : assign cols w/ >= symfrac residues as consensus [default]
  --hand           : manual construction (requires reference annotation)
  --symfrac <x>    : sets sym fraction controlling --fast construction [0.5]
  --fragthresh <x> : if L <= x*alen, tag sequence as a fragment [0.5]
Alternative relative sequence weighting strategies:
  --wpb     : Henikoff position-based weights [default]
  --wgsc    : Gerstein/Sonnhammer/Chothia tree weights
  --wblosum : Henikoff simple filter weights
  --wnone   : don't do any relative weighting; set all to 1
  --wgiven  : use weights as given in MSA file
  --wid <x> : for --wblosum: set identity cutoff [0.62] (0<=x<=1)
Alternative effective sequence weighting strategies:
  --eent       : adjust eff seq # to achieve relative entropy target [default]
  --eclust     : eff seq # is # of single linkage clusters
  --enone      : no effective seq # weighting: just use nseq
  --eset <x>   : set eff seq # for all models to <x>
  --ere <x>    : for --eent: set minimum rel entropy/position to <x>
  --esigma <x> : for --eent: set sigma param to <x> [45.0]
  --eid <x>    : for --eclust: set fractional identity cutoff to <x> [0.62]
Alternative prior strategies:
  --pnone    : don't use any prior; parameters are frequencies
  --plaplace : use a Laplace +1 prior
Handling single sequence inputs:
  --singlemx    : use substitution score matrix for single-sequence inputs
  --popen <x>   : gap open probability (with --singlemx)
  --pextend <x> : gap extend probability (with --singlemx)
  --mx <s>      : substitution score matrix (built-in matrices, with --singlemx)
  --mxfile <f>  : read substitution score matrix from file <f> (with --singlemx)
Control of E-value calibration:
  --EmL <n> : length of sequences for MSV Gumbel mu fit [200] (n>0)
  --EmN <n> : number of sequences for MSV Gumbel mu fit [200] (n>0)
  --EvL <n> : length of sequences for Viterbi Gumbel mu fit [200] (n>0)
  --EvN <n> : number of sequences for Viterbi Gumbel mu fit [200] (n>0)
  --EfL <n> : length of sequences for Forward exp tail tau fit [100] (n>0)
  --EfN <n> : number of sequences for Forward exp tail tau fit [200] (n>0)
  --Eft <x> : tail mass for Forward exponential tail tau fit [0.04] (0<x<1)
Other options:
  --cpu <n>          : number of parallel CPU workers for multithreads
  --stall            : arrest after start: for attaching debugger to process
  --informat <s>     : assert input alifile is in format <s> (no autodetect)
  --seed <n>         : set RNG seed to <n> (if 0: one-time arbitrary seed) [42]
  --w_beta <x>       : tail mass at which window length is determined
  --w_length <n>     : window length
  --maxinsertlen <n> : pretend all inserts are length <= <n>
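As an illustration of how the hmmbuild usage line and options fit together, the following Python sketch assembles an argument list that could be passed to subprocess.run. It uses only flags listed in the help text above; the function name hmmbuild_command and the file names in the example are hypothetical, and the sketch only builds the command, it does not run HMMER.

```python
import shlex


def hmmbuild_command(hmm_out, msa_file, name=None, alphabet=None, cpu=None, seed=None):
    """Build an argv list for: hmmbuild [-options] <hmmfile_out> <msafile>.

    Hypothetical helper; only flags shown in the help text are used.
    """
    if alphabet not in (None, "amino", "dna", "rna"):
        raise ValueError("alphabet must be 'amino', 'dna', 'rna', or None")
    argv = ["hmmbuild"]
    if name is not None:
        argv += ["-n", name]            # -n <s> : name the HMM <s>
    if alphabet is not None:
        argv.append("--" + alphabet)    # --amino / --dna / --rna
    if cpu is not None:
        argv += ["--cpu", str(cpu)]     # --cpu <n> : parallel CPU workers
    if seed is not None:
        argv += ["--seed", str(seed)]   # --seed <n> : RNG seed
    # Options come first, then the two positional arguments.
    argv += [hmm_out, msa_file]
    return argv


if __name__ == "__main__":
    # File names below are placeholders, not files shipped with this page.
    cmd = hmmbuild_command("family.hmm", "family.sto", name="family", alphabet="amino")
    print(shlex.join(cmd))
```

Returning a list rather than a single string avoids shell quoting problems when file names contain spaces; shlex.join is used only for display.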
hmmconvert
- hmmconvert :: convert profile file to a HMMER format
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmconvert [-options] <hmmfile>
Options:
  -h           : show brief help on version and usage
  -a           : ascii: output models in HMMER3 ASCII format [default]
  -b           : binary: output models in HMMER3 binary format
  -2           : HMMER2: output backward compatible HMMER2 ASCII format (ls mode)
  --outfmt <s> : choose output legacy 3.x file formats by name, such as '3/a'
hmmemit
- hmmemit :: sample sequence(s) from a profile HMM
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmemit [-options] <hmmfile (single)>
Common options are:
  -h     : show brief help on version and usage
  -o <f> : send sequence output to file <f>, not stdout
  -N <n> : number of seqs to sample [1] (n>0)
Options controlling what to emit:
  -a : emit alignment
  -c : emit simple majority-rule consensus sequence
  -C : emit fancier consensus sequence (req's --minl, --minu)
  -p : sample sequences from profile, not core model
Options controlling emission from profiles with -p:
  -L <n>      : set expected length from profile to <n> [400]
  --local     : configure profile in multihit local mode [default]
  --unilocal  : configure profile in unilocal mode
  --glocal    : configure profile in multihit glocal mode
  --uniglocal : configure profile in unihit glocal mode
Options controlling fancy consensus emission with -C:
  --minl <x> : show consensus as 'any' (X/N) unless >= this fraction [0.0]
  --minu <x> : show consensus as upper case if >= this fraction [0.0]
Other options:
  --seed <n> : set RNG seed to <n> [0] (n>=0)
hmmfetch
- hmmfetch :: retrieve profile HMM(s) from a file
- Easel h3.1b1 (May 2013)
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the Janelia Farm Software License.
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmfetch [options] <hmmfile> <key>         (retrieves HMM named <key>)
Usage: hmmfetch [options] -f <hmmfile> <keyfile>  (retrieves all HMMs in <keyfile>)
Usage: hmmfetch [options] --index <hmmfile>       (indexes <hmmfile>)
Options:
  -h      : help; show brief info on version and usage
  -f      : second cmdline arg is a file of names to retrieve
  -o <f>  : output HMM to file <f> instead of stdout
  -O      : output HMM to file named <key>
  --index : index the <hmmfile>, creating <hmmfile>.ssi
hmmlogo
- hmmlogo :: given an hmm, produce data required to build an hmm logo
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmlogo <hmmfile> [options]
Options:
  -h                      : show brief help on version and usage
  --height_emission       : total height = relative entropy ; residue height = emission (default)
  --height_positive_score : total height = relative entropy ; residue height = % of positive score
  --height_bits           : total height = sums of (pos|neg) scores; residue height = score
  --no_indel              : don't provide indel rate values
./hmmlogo
hmmpress
- hmmpress :: prepare an HMM database for faster hmmscan searches
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmpress [-options] <hmmfile>
Options:
  -h : show brief help on version and usage
  -f : force: overwrite any previous pressed files
hmmscan
- hmmscan :: search sequence(s) against a profile database
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmscan [-options] <hmmdb> <seqfile>
Basic options:
  -h : show brief help on version and usage
Options controlling output:
  -o <f>           : direct output to file <f>, not stdout
  --tblout <f>     : save parseable table of per-sequence hits to file <f>
  --domtblout <f>  : save parseable table of per-domain hits to file <f>
  --pfamtblout <f> : save table of hits and domains to file <f>, in Pfam format
  --acc            : prefer accessions over names in output
  --noali          : don't output alignments, so output is smaller
  --notextw        : unlimit ASCII text output line width
  --textw <n>      : set max width of ASCII text output lines [120] (n>=120)
Options controlling reporting thresholds:
  -E <x>     : report models <= this E-value threshold in output [10.0] (x>0)
  -T <x>     : report models >= this score threshold in output
  --domE <x> : report domains <= this E-value threshold in output [10.0] (x>0)
  --domT <x> : report domains >= this score cutoff in output
Options controlling inclusion (significance) thresholds:
  --incE <x>    : consider models <= this E-value threshold as significant
  --incT <x>    : consider models >= this score threshold as significant
  --incdomE <x> : consider domains <= this E-value threshold as significant
  --incdomT <x> : consider domains >= this score threshold as significant
Options for model-specific thresholding:
  --cut_ga : use profile's GA gathering cutoffs to set all thresholding
  --cut_nc : use profile's NC noise cutoffs to set all thresholding
  --cut_tc : use profile's TC trusted cutoffs to set all thresholding
Options controlling acceleration heuristics:
  --max    : Turn all heuristic filters off (less speed, more power)
  --F1 <x> : MSV threshold: promote hits w/ P <= F1 [0.02]
  --F2 <x> : Vit threshold: promote hits w/ P <= F2 [1e-3]
  --F3 <x> : Fwd threshold: promote hits w/ P <= F3 [1e-5]
  --nobias : turn off composition bias filter
Other expert options:
  --nonull2     : turn off biased composition score corrections
  -Z <x>        : set # of comparisons done, for E-value calculation
  --domZ <x>    : set # of significant seqs, for domain E-value calculation
  --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed) [42]
  --qformat <s> : assert input <seqfile> is in format <s>: no autodetection
  --daemon      : run program as a daemon
  --cpu <n>     : number of parallel CPU workers to use for multithreads
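Since hmmpress is described above as preparing an HMM database for hmmscan, the two programs are typically run in sequence. The Python sketch below builds both argument lists in that order. The function name scan_pipeline and the file names are hypothetical; treating -E and --cut_ga as alternatives is an assumption of this sketch, not something the help text states.

```python
def scan_pipeline(hmmdb, seqfile, tblout=None, evalue=None, cut_ga=False,
                  cpu=None, force_press=False):
    """Return [hmmpress argv, hmmscan argv] for a profile database search.

    Hypothetical helper; only flags shown in the help text are used.
    """
    # Assumption of this sketch: an explicit E-value threshold and the
    # profile's own GA cutoffs are treated as mutually exclusive choices.
    if cut_ga and evalue is not None:
        raise ValueError("choose either evalue or cut_ga, not both")

    press = ["hmmpress"]
    if force_press:
        press.append("-f")                  # -f : overwrite previous pressed files
    press.append(hmmdb)

    scan = ["hmmscan"]
    if tblout is not None:
        scan += ["--tblout", tblout]        # per-sequence hit table
    if cut_ga:
        scan.append("--cut_ga")             # use the profile's GA cutoffs
    elif evalue is not None:
        scan += ["-E", str(evalue)]         # reporting E-value threshold
    if cpu is not None:
        scan += ["--cpu", str(cpu)]
    scan += [hmmdb, seqfile]                # Usage: hmmscan [-options] <hmmdb> <seqfile>
    return [press, scan]


if __name__ == "__main__":
    # Placeholder file names.
    for argv in scan_pipeline("profiles.hmm", "query.fa", tblout="hits.tbl", evalue=1e-5):
        print(" ".join(argv))
```

Each returned list can be handed to subprocess.run(argv, check=True) so that a failed hmmpress step stops the pipeline before hmmscan starts.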
hmmsearch
- hmmsearch :: search profile(s) against a sequence database
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmsearch [options] <hmmfile> <seqdb>
Basic options:
  -h : show brief help on version and usage
Options directing output:
  -o <f>           : direct output to file <f>, not stdout
  -A <f>           : save multiple alignment of all hits to file <f>
  --tblout <f>     : save parseable table of per-sequence hits to file <f>
  --domtblout <f>  : save parseable table of per-domain hits to file <f>
  --pfamtblout <f> : save table of hits and domains to file <f>, in Pfam format
  --acc            : prefer accessions over names in output
  --noali          : don't output alignments, so output is smaller
  --notextw        : unlimit ASCII text output line width
  --textw <n>      : set max width of ASCII text output lines [120] (n>=120)
Options controlling reporting thresholds:
  -E <x>     : report sequences <= this E-value threshold in output [10.0] (x>0)
  -T <x>     : report sequences >= this score threshold in output
  --domE <x> : report domains <= this E-value threshold in output [10.0] (x>0)
  --domT <x> : report domains >= this score cutoff in output
Options controlling inclusion (significance) thresholds:
  --incE <x>    : consider sequences <= this E-value threshold as significant
  --incT <x>    : consider sequences >= this score threshold as significant
  --incdomE <x> : consider domains <= this E-value threshold as significant
  --incdomT <x> : consider domains >= this score threshold as significant
Options controlling model-specific thresholding:
  --cut_ga : use profile's GA gathering cutoffs to set all thresholding
  --cut_nc : use profile's NC noise cutoffs to set all thresholding
  --cut_tc : use profile's TC trusted cutoffs to set all thresholding
Options controlling acceleration heuristics:
  --max    : Turn all heuristic filters off (less speed, more power)
  --F1 <x> : Stage 1 (MSV) threshold: promote hits w/ P <= F1 [0.02]
  --F2 <x> : Stage 2 (Vit) threshold: promote hits w/ P <= F2 [1e-3]
  --F3 <x> : Stage 3 (Fwd) threshold: promote hits w/ P <= F3 [1e-5]
  --nobias : turn off composition bias filter
Other expert options:
  --nonull2     : turn off biased composition score corrections
  -Z <x>        : set # of comparisons done, for E-value calculation
  --domZ <x>    : set # of significant seqs, for domain E-value calculation
  --seed <n>    : set RNG seed to <n> (if 0: one-time arbitrary seed) [42]
  --tformat <s> : assert target <seqfile> is in format <s>: no autodetection
  --cpu <n>     : number of parallel CPU workers to use for multithreads
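The --tblout option writes a whitespace-delimited table of per-sequence hits that is meant to be read by other programs. The Python sketch below reads such a table, assuming the usual layout of 18 fixed columns followed by a free-text description, with comment lines starting with '#'. The Hit type and the function names are hypothetical, and the column layout should be checked against the header of an actual output file before relying on it.

```python
from typing import Iterable, List, NamedTuple


class Hit(NamedTuple):
    target: str
    query: str
    evalue: float       # full-sequence E-value
    score: float        # full-sequence bit score
    bias: float         # full-sequence bias correction
    description: str


def parse_tblout(lines: Iterable[str]) -> List[Hit]:
    """Parse --tblout lines into Hit records (assumed 18 columns + description)."""
    hits = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue                        # skip blank lines and comments
        # Split on whitespace at most 18 times so that the description,
        # which may itself contain spaces, stays in one piece.
        fields = line.split(None, 18)
        if len(fields) < 18:
            raise ValueError("unexpected tblout line: " + line.rstrip())
        description = fields[18].strip() if len(fields) > 18 else ""
        hits.append(Hit(fields[0], fields[2], float(fields[4]),
                        float(fields[5]), float(fields[6]), description))
    return hits


def significant(hits: Iterable[Hit], max_evalue: float) -> List[Hit]:
    """Keep hits at or below an E-value cutoff, mirroring the -E comparison."""
    return [h for h in hits if h.evalue <= max_evalue]


if __name__ == "__main__":
    import sys
    with open(sys.argv[1]) as handle:       # path to a --tblout file
        for hit in significant(parse_tblout(handle), 1e-3):
            print(hit.target, hit.evalue, hit.score)
```

Filtering after the search is a convenience only; E-values in the table were already computed with whatever -Z and reporting thresholds the search used.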
hmmsim
- hmmsim :: collect profile HMM score distributions on random sequences
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmsim [-options] <hmmfile>
Common options:
  -h     : show brief help on version and usage
  -a     : obtain alignment length statistics too
  -v     : verbose: print scores
  -L <n> : length of random target seqs [100] (n>0)
  -N <n> : number of random target seqs [1000] (n>0)
Output options (only in serial mode, for single HMM input):
  -o <f>      : direct output to file <f>, not stdout
  --afile <f> : output alignment lengths to file <f>
  --efile <f> : output E vs. E plots to <f> in xy format
  --ffile <f> : output filter fraction: # seqs passing P thresh
  --pfile <f> : output P(S>x) plots to <f> in xy format
  --xfile <f> : output bitscores as binary double vector to <f>
Alternative alignment styles :
  --fs : multihit local alignment [default]
  --sw : unihit local alignment
  --ls : multihit glocal alignment
  --s  : unihit glocal alignment
Alternative scoring algorithms :
  --vit  : score seqs with the Viterbi algorithm [default]
  --fwd  : score seqs with the Forward algorithm
  --hyb  : score seqs with the Hybrid algorithm
  --msv  : score seqs with the MSV algorithm
  --fast : use the optimized versions of the above
Controlling range of fitted tail masses :
  --tmin <x>    : set lower bound tail mass for fwd,island [0.02]
  --tmax <x>    : set upper bound tail mass for fwd,island [0.02]
  --tpoints <n> : set number of tail probs to try [1]
  --tlinear     : use linear not log spacing of tail probs
Controlling E-value calibration :
  --EmL <n> : length of sequences for MSV Gumbel mu fit [200] (n>0)
  --EmN <n> : number of sequences for MSV Gumbel mu fit [200] (n>0)
  --EvL <n> : length of sequences for Viterbi Gumbel mu fit [200] (n>0)
  --EvN <n> : number of sequences for Viterbi Gumbel mu fit [200] (n>0)
  --EfL <n> : length of sequences for Forward exp tail tau fit [100] (n>0)
  --EfN <n> : number of sequences for Forward exp tail tau fit [200] (n>0)
  --Eft <x> : tail mass for Forward exponential tail tau fit [0.04] (0<x<1)
Debugging :
  --stall    : arrest after start: for debugging MPI under gdb
  --seed <n> : set random number seed to <n> [0]
Experiments :
  --bgflat           : set uniform background frequencies
  --bgcomp           : set bg frequencies to model's average composition
  --x-no-lengthmodel : turn the H3 length model off
  --nu <x>           : set nu parameter (# expected HSPs) for GMSV [2.0]
  --pthresh <x>      : set P-value threshold for --ffile [0.02]
hmmstat
- hmmstat :: display summary statistics for a profile file
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmstat [-options] <hmmfile>
Options:
  -h           : show brief help on version and usage
  --eval2score : compute score for E-value (E) for database of (Z) sequences
  --score2eval : compute E-value for score (S) for database of (Z) sequences
  -Z <n>       : database size (in seqs) for --eval2score or --score2eval
  --baseZ <n>  : database size (M bases) (DNA only, if search on both strands)
  --baseZ1 <n> : database size (M bases) (DNA only, if search on single strand)
  -E <x>       : E-value threshold, for --eval2score
  -S <x>       : Score input for --score2eval
hmmpgmd
- hmmpgmd :: search a query against a database
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmpgmd [options]
Basic options:
-h : show brief help on version and usage
Other expert options:
  --master     : run program as the master server
  --worker <s> : run program as a worker with server at <s>
  --cport <n>  : port to use for client/server communication [51371]
  --wport <n>  : port to use for server/worker communication [51372]
  --ccncts <n> : maximum number of client side connections to accept [16]
  --wcncts <n> : maximum number of worker side connections to accept [32]
  --pid <f>    : file to write process id to
  --daemon     : run as a daemon using config file: /etc/hmmpgmd.conf
  --seqdb <f>  : protein database to cache for searches
  --hmmdb <f>  : hmm database to cache for searches
  --cpu <n>    : number of parallel CPU workers to use for multithreads
hmmc2
Usage: ./hmmc2 [-i addr] [-p port] [-A] [-S]
  -S      : print sequence scores
  -A      : print sequence alignments
  -i addr : ip address running daemon (default: 127.0.0.1)
  -p port : port daemon listens to clients on (default: 51371)
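The hmmpgmd and hmmc2 options above describe three roles: a master server, workers that connect to it, and a client that sends queries. The Python sketch below shows how the listed flags might be combined for each role, using the default ports from the help text. The function name daemon_commands is hypothetical, and the particular combination of flags (for example passing --wport to the worker) is an assumption that should be checked against the HMMER documentation.

```python
def daemon_commands(seqdb, master_host="127.0.0.1", cport=51371, wport=51372):
    """Return argv lists for a master, one worker, and a client.

    Hypothetical helper; flag combinations are an assumption of this sketch.
    """
    master = [
        "hmmpgmd", "--master",
        "--seqdb", seqdb,            # protein database to cache for searches
        "--cport", str(cport),       # client/server port
        "--wport", str(wport),       # server/worker port
    ]
    worker = [
        "hmmpgmd", "--worker", master_host,   # worker with server at <s>
        "--wport", str(wport),
    ]
    client = [
        "hmmc2", "-i", master_host,  # address of the running daemon
        "-p", str(cport),            # port the daemon listens to clients on
        "-S",                        # print sequence scores
    ]
    return {"master": master, "worker": worker, "client": client}


if __name__ == "__main__":
    # "targets.fa" is a placeholder database name.
    for role, argv in daemon_commands("targets.fa").items():
        print(role + ":", " ".join(argv))
```

The master and client share the client port (51371 by default) while master and workers share the worker port (51372 by default), which is why the two port values are threaded through separately.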
hmmpgmd2msa_utest
- hmmpgmd2msa_utest :: given an hmm, produce data required to build an hmm logo
- HMMER 3.1b1 (May 2013); http://hmmer.org/
- Copyright (C) 2013 Howard Hughes Medical Institute.
- Freely distributed under the GNU General Public License (GPLv3).
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
Usage: hmmpgmd2msa_utest <hmmfile> [options]
Options:
  -h                      : show brief help on version and usage
  --height_emission       : total height = relative entropy ; residue height = emission (default)
  --height_positive_score : total height = relative entropy ; residue height = % of positive score
  --height_bits           : total height = sums of (pos|neg) scores; residue height = score
  --no_indel              : don't provide indel rate values
./hmmpgmd2msa_utest
More information on how to run the HMMER programs can be found at the HMMER website (http://hmmer.org/).