Pair Hidden Markov Models

Pair HMMs extend classical HMMs: instead of emitting a single symbol, each state emits a column of an alignment.
We assign an alphabet and two offsets to each state. The offsets determine how many characters are read from each of the two sequences. Consequently, each state Si has a distribution over |alphabet(Si)| possible emissions if one of the offsets is zero, and over |alphabet(Si)|^2 emissions if both offsets are non-zero.
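The counting rule above can be illustrated with a few lines of Python. This is only a sketch: the function name emission_count is hypothetical and not part of ghmm, and it assumes that each offset is either 0 or 1 and that at least one of them is non-zero.

```python
def emission_count(alphabet_size, offset_x, offset_y):
    """Number of possible emissions of a pair HMM state.

    Hypothetical helper for illustration only (not a ghmm function).
    Assumes each offset is 0 or 1 and that at least one is non-zero.
    """
    if offset_x not in (0, 1) or offset_y not in (0, 1):
        raise ValueError("this sketch only handles offsets of 0 or 1")
    if offset_x == 0 and offset_y == 0:
        raise ValueError("a state must read from at least one sequence")
    if offset_x == 0 or offset_y == 0:
        # Gap state: one character from a single sequence.
        return alphabet_size
    # Match state: one character from each sequence, so all pairs are possible.
    return alphabet_size ** 2


if __name__ == "__main__":
    dna = 4  # size of the alphabet {A, C, G, T}
    print("match state:", emission_count(dna, 1, 1))
    print("gap state:  ", emission_count(dna, 1, 0))
```

For a DNA alphabet, a state that reads from both sequences therefore needs a table of 16 emission probabilities, while a gap state needs only 4.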
The pair HMM is specified mainly by a GraphML-based XML file, which can be partly created using HMMEd. The emission probabilities and the alphabets have to be defined manually in the XML file.
Applications include probabilistic sequence alignment and comparative gene finding. Both rely on the Viterbi algorithm.
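To make the role of the Viterbi algorithm concrete, here is a minimal, self-contained sketch of Viterbi decoding for a pair HMM in log space. It does not use the ghmm API: the function pair_viterbi, the three-state toy model (M, X, Y) and all probabilities are assumptions chosen purely for illustration. The sketch further assumes that every offset is 0 or 1 and that every state reads at least one character.

```python
import math

NEG_INF = float("-inf")


def pair_viterbi(x, y, states, trans, start, emit):
    """Most probable state path aligning x and y (illustrative sketch).

    states: dict name -> (offset_x, offset_y), each offset 0 or 1
    trans:  dict (from_state, to_state) -> log probability
    start:  dict name -> log probability of starting in that state
    emit:   function(state, a, b) -> log probability (a or b is None for a gap)
    """
    n, m = len(x), len(y)
    # V[s][i][j]: best log probability of a path that has read x[:i] and y[:j]
    # and ends in state s.
    V = {s: [[NEG_INF] * (m + 1) for _ in range(n + 1)] for s in states}
    back = {}
    for i in range(n + 1):
        for j in range(m + 1):
            for s, (dx, dy) in states.items():
                pi, pj = i - dx, j - dy
                if pi < 0 or pj < 0:
                    continue
                if pi == 0 and pj == 0:
                    best, prev = start.get(s, NEG_INF), None
                else:
                    best, prev = NEG_INF, None
                    for r in states:
                        cand = V[r][pi][pj] + trans.get((r, s), NEG_INF)
                        if cand > best:
                            best, prev = cand, r
                if best > NEG_INF:
                    a = x[i - 1] if dx else None
                    b = y[j - 1] if dy else None
                    V[s][i][j] = best + emit(s, a, b)
                    back[(s, i, j)] = prev
    end = max(states, key=lambda s: V[s][n][m])
    score = V[end][n][m]
    if score == NEG_INF:
        return score, []
    # Follow the back pointers from the end of both sequences to the start.
    path, s, i, j = [], end, n, m
    while s is not None:
        path.append(s)
        prev = back[(s, i, j)]
        i, j = i - states[s][0], j - states[s][1]
        s = prev
    path.reverse()
    return score, path


# Toy model (assumed values): M emits an aligned pair, X and Y emit gaps.
STATES = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}
START = {"M": math.log(0.8), "X": math.log(0.1), "Y": math.log(0.1)}
TRANS = {
    ("M", "M"): math.log(0.8), ("M", "X"): math.log(0.1), ("M", "Y"): math.log(0.1),
    ("X", "X"): math.log(0.3), ("X", "M"): math.log(0.7),
    ("Y", "Y"): math.log(0.3), ("Y", "M"): math.log(0.7),
}


def toy_emit(state, a, b):
    """Toy emissions over {A, C, G, T}: 4 * 0.2 + 12 * (0.2 / 12) = 1 for M."""
    if state == "M":
        return math.log(0.2) if a == b else math.log(0.2 / 12)
    return math.log(0.25)


if __name__ == "__main__":
    score, path = pair_viterbi("ACGT", "AGT", STATES, TRANS, START, toy_emit)
    print(score, path)
```

The returned state path is the alignment itself: each M corresponds to an aligned column, each X or Y to a gap in one of the two sequences. The full table used here needs memory proportional to the product of the sequence lengths.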
The project was developed by Matthias Heinig during his Master's thesis at Genoscope as the basis of a pair-HMM-based gene finder. The work was supervised by:

Jean-Marc Aury (Genoscope)
Vincent Schachter (Genoscope)
Alexander Schliep (MPI)

Features of the pair HMM implementation in ghmm:

Easy definition of pair HMMs using HMMEd and XML
Viterbi algorithm
Linear memory implementation of the Viterbi algorithm
Integration into the ghmm Python interface
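
The idea behind a linear-memory Viterbi can be sketched as follows. This is not the ghmm implementation, whose internals are not described here. It is a simplified illustration that computes only the Viterbi score while keeping two rows of the dynamic programming table, under the assumption that all offsets are 0 or 1 and that every state reads at least one character. Recovering the path in linear memory needs additional techniques, such as divide and conquer or checkpointing, which are omitted. The function name and the toy model are hypothetical.

```python
import math

NEG_INF = float("-inf")


def viterbi_score_two_rows(x, y, states, trans, start, emit):
    """Viterbi log score of a pair HMM using O(len(y)) memory (sketch only)."""
    n, m = len(x), len(y)
    prev_row = {s: [NEG_INF] * (m + 1) for s in states}  # row i - 1
    for i in range(n + 1):
        cur_row = {s: [NEG_INF] * (m + 1) for s in states}  # row i
        for j in range(m + 1):
            for s, (dx, dy) in states.items():
                pi, pj = i - dx, j - dy
                if pi < 0 or pj < 0:
                    continue
                # A state with offset_x == 0 stays in the current row.
                src = cur_row if dx == 0 else prev_row
                if pi == 0 and pj == 0:
                    best = start.get(s, NEG_INF)
                else:
                    best = max(src[r][pj] + trans.get((r, s), NEG_INF)
                               for r in states)
                if best > NEG_INF:
                    a = x[i - 1] if dx else None
                    b = y[j - 1] if dy else None
                    cur_row[s][j] = best + emit(s, a, b)
        prev_row = cur_row  # older rows are discarded
    return max(prev_row[s][m] for s in states)


# Toy model (assumed values): M emits an aligned pair, X and Y emit gaps.
STATES = {"M": (1, 1), "X": (1, 0), "Y": (0, 1)}
START = {"M": math.log(0.8), "X": math.log(0.1), "Y": math.log(0.1)}
TRANS = {
    ("M", "M"): math.log(0.8), ("M", "X"): math.log(0.1), ("M", "Y"): math.log(0.1),
    ("X", "X"): math.log(0.3), ("X", "M"): math.log(0.7),
    ("Y", "Y"): math.log(0.3), ("Y", "M"): math.log(0.7),
}


def toy_emit(state, a, b):
    """Toy emissions over {A, C, G, T}."""
    if state == "M":
        return math.log(0.2) if a == b else math.log(0.2 / 12)
    return math.log(0.25)


if __name__ == "__main__":
    print(viterbi_score_two_rows("ACGT", "AGT", STATES, TRANS, START, toy_emit))
```

Because each cell depends only on the current and the previous row, memory grows with the length of one sequence rather than with the product of both lengths.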