The HMM base class.
All functions whose C signatures allow it are defined here.
Unfortunately, there is still a lot of overloading going on in the
derived classes. Generic features (these apply to all derived classes):
- Forward algorithm
- Viterbi algorithm
- Baum-Welch training
- HMM distance metric
- ...
Methods
|
|
|
|
__del__
|
__del__ ( self )
Deallocation routine for the underlying C data structures.
|
|
__init__
|
__init__ (
self,
emissionDomain,
distribution,
cmodel,
)
|
|
backward
|
backward (
self,
emissionSequence,
scalingVector,
)
Result: the (N x T)-matrix containing the backward-variables
Exceptions
|
|
TypeError, "EmissionSequence required, got " + str( emissionSequence.__class__.__name__ )
|
|
|
baumWelch
|
baumWelch (
self,
trainingSequences,
nrSteps,
loglikelihoodCutoff,
)
The functions for model training are defined in the derived classes.
|
|
baumWelchDelete
|
baumWelchDelete ( self )
|
|
baumWelchSetup
|
baumWelchSetup (
self,
trainingSequences,
nrSteps,
)
|
|
baumWelchStep
|
baumWelchStep (
self,
nrSteps,
loglikelihoodCutoff,
)
|
|
distance
|
distance (
self,
model,
seqLength,
)
Returns the distance between self.cmodel and model.
C declaration:
extern double smodel_prob_distance(smodel cm0, smodel cm, int maxT, int symmetric, int verbose);
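One common probabilistic distance between two HMMs samples a long sequence from the first model and compares the per-symbol log-likelihoods under both models. The sketch below illustrates that idea for discrete models in plain Python. It is an assumption that the C routine works along these lines (the maxT and symmetric arguments suggest it); the helper names and the (pi, A, B) tuple representation are hypothetical and not part of the library.

```python
import math
import random


def _sample(pi, A, B, T, rng):
    """Draw T symbols from a discrete HMM given as plain lists."""
    def draw(probs):
        return rng.choices(range(len(probs)), weights=probs)[0]

    state = draw(pi)
    obs = []
    for _ in range(T):
        obs.append(draw(B[state]))
        state = draw(A[state])
    return obs


def _loglik(pi, A, B, obs):
    """log P[obs | model] via a scaled forward recursion."""
    n = len(pi)
    col = [pi[i] * B[i][obs[0]] for i in range(n)]
    logp = 0.0
    for t in range(len(obs)):
        if t > 0:
            col = [sum(col[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(col)
        logp += math.log(c)          # fails if obs is impossible under the model
        col = [x / c for x in col]
    return logp


def prob_distance(m0, m, T, symmetric=False, seed=1):
    """Monte Carlo estimate of (log P[O|m0] - log P[O|m]) / T with O drawn from m0.

    m0 and m are (pi, A, B) tuples. With symmetric=True the estimate is
    averaged with the one obtained after swapping the roles of m0 and m.
    """
    rng = random.Random(seed)
    obs = _sample(*m0, T, rng)
    d = (_loglik(*m0, obs) - _loglik(*m, obs)) / T
    if symmetric:
        obs = _sample(*m, T, rng)
        d_rev = (_loglik(*m, obs) - _loglik(*m0, obs)) / T
        d = 0.5 * (d + d_rev)
    return d
```

The estimate is random for finite T, so a larger T gives a more stable value; the one-sided version is not symmetric in its arguments.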
|
|
forward
|
forward ( self, emissionSequence )
Result: the (N x T)-matrix containing the forward-variables
and the scaling vector
Exceptions
|
|
TypeError, "EmissionSequence required, got " + str( emissionSequence.__class__.__name__ )
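To make the forward variables and the scaling vector concrete, here is a minimal pure-Python sketch for a discrete HMM. It does not call the library: pi, A, B and the function name are hypothetical stand-ins, and the scaling convention (each column is divided by its sum) is an assumption that may differ from the C implementation. The result is stored as alpha[t][i], that is, transposed relative to the (N x T) layout named above.

```python
import math


def forward_scaled(pi, A, B, obs):
    """Scaled forward pass for a discrete HMM.

    pi[i]   : initial probability of state i
    A[i][j] : transition probability from state i to state j
    B[i][o] : probability that state i emits symbol o
    Returns (alpha, scale); alpha[t] sums to 1 and scale[t] is the
    factor that column t was divided by.
    """
    n = len(pi)
    alpha, scale = [], []
    for t, o in enumerate(obs):
        if t == 0:
            col = [pi[i] * B[i][o] for i in range(n)]
        else:
            prev = alpha[-1]
            col = [sum(prev[i] * A[i][j] for i in range(n)) * B[j][o]
                   for j in range(n)]
        c = sum(col)
        if c == 0.0:
            raise ValueError("sequence has probability zero under the model")
        alpha.append([x / c for x in col])
        scale.append(c)
    return alpha, scale


# Usage: the log-likelihood is the sum of the logs of the scaling factors.
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.9, 0.1], [0.2, 0.8]]
alpha, scale = forward_scaled(pi, A, B, [0, 1, 0])
log_p = sum(math.log(c) for c in scale)
```

Scaling keeps the recursion from underflowing on long sequences, which is why the scaling vector is returned alongside the matrix.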
|
|
|
getEmission
|
getEmission ( self, i )
Accessor function for the emission distribution parameters of state i.
For discrete models the distribution over the symbols is returned;
for continuous models a matrix of the form
[ [mu_1, sigma_1, weight_1] ... [mu_M, sigma_M, weight_M] ] is returned.
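As an illustration of how a parameter matrix of this shape could be consumed, the sketch below evaluates a Gaussian mixture density from a list of [mu, sigma, weight] rows. It assumes that sigma is a standard deviation; the text above does not say whether it is a standard deviation or a variance, so check that before relying on it. The function name is hypothetical.

```python
import math


def mixture_density(params, x):
    """Density at x of a Gaussian mixture given as [[mu, sigma, weight], ...].

    Assumption: sigma is a standard deviation, not a variance.
    """
    total = 0.0
    for mu, sigma, weight in params:
        z = (x - mu) / sigma
        total += weight * math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))
    return total


# Usage with a hypothetical two-component emission of one state.
density = mixture_density([[0.0, 1.0, 0.3], [5.0, 2.0, 0.7]], 1.5)
```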
|
|
getInitial
|
getInitial ( self, i )
Accessor function for the initial probability \pi_i
|
|
getTransition
|
getTransition (
self,
i,
j,
)
Accessor function for the transition a_ij
|
|
loglikelihood
|
loglikelihood ( self, emissionSequences )
Compute log( P[emissionSequences| model]) using the forward algorithm,
assuming independence of the sequences in emissionSequences.
emissionSequences can be either a SequenceSet or an EmissionSequence.
Result: log( P[emissionSequences| model]) of type float, which is
computed as \sum_{s} log( P[s| model]) when emissionSequences
is a SequenceSet.
Note: The implementation does not compute the full forward matrix, since only the
likelihoods are of interest in this case.
Exceptions
|
|
TypeError, "EmissionSequence or SequenceSet required, got " + str( emissionSequences.__class__.__name__ )
|
|
|
loglikelihoods
|
loglikelihoods ( self, emissionSequences )
Compute a vector ( log( P[s| model]) )_{s} of log-likelihoods of the
individual sequences in emissionSequences using the forward algorithm.
emissionSequences is of type SequenceSet.
Result: a (numarray) vector of floats holding log( P[s| model]) for each sequence s.
XXX Better name? The current one is too similar to "loglikelihood". XXX
Exceptions
|
|
TypeError, "EmissionSequence or SequenceSet required, got " + str( emissionSequences.__class__.__name__ )
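The relation between loglikelihood and loglikelihoods can be sketched in plain Python as follows: one value per sequence, and their sum for the whole set. The sketch works in log space and keeps only the current forward column instead of the full matrix. It is not the library code; the (pi, A, B) list representation and all function names are hypothetical.

```python
import math

NEG_INF = float("-inf")


def _log(p):
    """Natural log that maps probability 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def _logsumexp(xs):
    """log(sum(exp(x) for x in xs)) computed stably."""
    m = max(xs)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(x - m) for x in xs))


def sequence_loglikelihood(pi, A, B, obs):
    """log P[obs | model] using a single rolling forward column."""
    n = len(pi)
    col = [_log(pi[i]) + _log(B[i][obs[0]]) for i in range(n)]
    for o in obs[1:]:
        col = [_logsumexp([col[i] + _log(A[i][j]) for i in range(n)]) + _log(B[j][o])
               for j in range(n)]
    return _logsumexp(col)


def loglikelihoods(pi, A, B, sequences):
    """One log-likelihood per sequence."""
    return [sequence_loglikelihood(pi, A, B, s) for s in sequences]


def loglikelihood(pi, A, B, sequences):
    """Joint log-likelihood of independent sequences: the sum of the parts."""
    return sum(loglikelihoods(pi, A, B, sequences))
```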
|
|
|
logprob
|
logprob (
self,
emissionSequence,
stateSequence,
)
log P[ emissionSequence, stateSequence| m]
Defined in derived classes.
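For a discrete model, the joint log-probability of an emission sequence and a state sequence is the log of the initial probability plus the logs of every transition and emission along the path. A minimal sketch with plain lists follows; the names and the list representation are hypothetical, and the derived classes may implement this differently.

```python
import math


def joint_logprob(pi, A, B, obs, states):
    """log P[obs, states | model] for a discrete HMM given as plain lists."""
    if len(obs) != len(states) or not obs:
        raise ValueError("obs and states must be non-empty and of equal length")
    # Collect every factor of the joint probability along the path.
    factors = [pi[states[0]], B[states[0]][obs[0]]]
    for t in range(1, len(obs)):
        factors.append(A[states[t - 1]][states[t]])
        factors.append(B[states[t]][obs[t]])
    if min(factors) == 0.0:
        return float("-inf")  # the path is impossible under the model
    return sum(math.log(p) for p in factors)
```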
|
|
normalize
|
normalize ( self )
Normalize the transition probabilities and, if applicable, the emission probabilities.
Defined in derived classes.
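Normalization here presumably means rescaling each row of a probability table so that it sums to 1. The sketch below shows that operation on a plain list-of-lists matrix. Leaving all-zero rows untouched is a choice made for this illustration, not documented library behavior, and the function name is hypothetical.

```python
def normalize_rows(matrix):
    """Rescale each row in place so that it sums to 1.

    Rows that sum to 0 are left unchanged (an assumption of this sketch).
    """
    for row in matrix:
        total = sum(row)
        if total > 0.0:
            row[:] = [x / total for x in row]
    return matrix


# Usage: the same helper applies to transition and discrete emission tables.
A = normalize_rows([[2.0, 2.0], [1.0, 3.0]])
```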
|
|
pathPosterior
|
pathPosterior (
self,
sequence,
path,
)
Returns the log posterior probability for path having generated sequence.
CAVEAT: pathPosterior needs to calculate the complete forward and
backward matrices. If you are interested in multiple paths, it is
more efficient to use the posterior function directly than to make
multiple calls to pathPosterior.
|
|
posterior
|
posterior ( self, sequence )
Posterior distribution matrix for sequence.
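The posterior matrix combines the forward and backward variables: entry (t, i) is the probability of being in state i at time t given the whole sequence. A minimal sketch for a discrete HMM follows. The scaling convention, the list representation and the function name are assumptions of this example and may not match the library.

```python
def posterior_matrix(pi, A, B, obs):
    """Return gamma with gamma[t][i] = P[state i at time t | obs]."""
    n, T = len(pi), len(obs)

    # Scaled forward pass: every column of alpha sums to 1.
    alpha, scale = [], []
    for t in range(T):
        if t == 0:
            col = [pi[i] * B[i][obs[0]] for i in range(n)]
        else:
            col = [sum(alpha[t - 1][i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                   for j in range(n)]
        c = sum(col)
        if c == 0.0:
            raise ValueError("sequence has probability zero under the model")
        alpha.append([x / c for x in col])
        scale.append(c)

    # Backward pass, reusing the forward scaling factors.
    beta = [[1.0] * n for _ in range(T)]
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(n)) / scale[t + 1]

    # With this scaling the elementwise product is already normalized.
    return [[alpha[t][i] * beta[t][i] for i in range(n)] for t in range(T)]
```

Once this matrix is available, the posterior of any single state at any single time is a lookup followed by a logarithm, which is why one call to posterior is cheaper than many per-state calls.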
|
|
printtypes
|
printtypes ( self, model_type )
|
|
randomize
|
randomize ( self, noiseLevel )
|
|
sample
|
sample (
self,
seqNr,
T,
seed=0,
)
Sample emission sequences
seqNr = number of sequences to be sampled
T = length of each sequence
seed = initialization value for rng, default 0 means
|
|
sampleSingle
|
sampleSingle (
self,
T,
seed=0,
)
Sample a single emission sequence of length at most T.
Returns a Sequence object.
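Sampling from a discrete HMM alternates between drawing a symbol from the current state's emission distribution and drawing the next state from its transition row. The sketch below shows this with plain lists. It always produces exactly T symbols and simply passes seed to Python's random.Random; both are assumptions of the example and say nothing about how the library treats seed = 0 or end states. All names are hypothetical.

```python
import random


def _draw(probs, rng):
    """Draw an index with the given probabilities (inverse-CDF method)."""
    u = rng.random()
    acc = 0.0
    for k, p in enumerate(probs):
        acc += p
        if u < acc:
            return k
    return len(probs) - 1  # guard against rounding when u is close to 1


def sample_sequences(pi, A, B, seqNr, T, seed=0):
    """Return seqNr emission sequences of length T as lists of symbol indices."""
    rng = random.Random(seed)
    sequences = []
    for _ in range(seqNr):
        state = _draw(pi, rng)
        seq = []
        for _ in range(T):
            seq.append(_draw(B[state], rng))
            state = _draw(A[state], rng)
        sequences.append(seq)
    return sequences
```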
|
|
setEmission
|
setEmission (
self,
i,
distributionParemters,
)
Set the emission distribution parameters
Defined in derived classes.
|
|
setInitial
|
setInitial (
self,
i,
prob,
fixProb=0,
)
Sets the initial probability \pi_i.
For fixProb = 1, \pi is rescaled to sum to 1 with pi[i] fixed to the
argument value of prob.
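One way to read the fixProb behavior is: set pi[i] to prob, then scale all other entries by a common factor so that the whole vector sums to 1 again. The sketch below implements that reading on a plain list. It is an interpretation of the description above, not the library code, and the function name is hypothetical.

```python
def set_initial(pi, i, prob, fix_prob=False):
    """Return a copy of pi with pi[i] = prob.

    With fix_prob set, the other entries are rescaled proportionally so
    that the vector sums to 1 while pi[i] keeps the value prob.
    """
    if not 0.0 <= prob <= 1.0:
        raise ValueError("prob must lie in [0, 1]")
    new_pi = list(pi)
    new_pi[i] = prob
    if fix_prob:
        rest = sum(p for k, p in enumerate(new_pi) if k != i)
        if rest > 0.0:
            factor = (1.0 - prob) / rest
            new_pi = [p if k == i else p * factor for k, p in enumerate(new_pi)]
    return new_pi
```

Without fix_prob the vector may no longer sum to 1, so a separate normalization step would be needed afterwards.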
|
|
setTransition
|
setTransition (
self,
i,
j,
prob,
)
Sets the transition probability a_ij.
|
|
state
|
state ( self, stateLabel )
Given a stateLabel, return the integer index of the state.
(State labels are not yet implemented.)
|
|
statePosterior
|
statePosterior (
self,
sequence,
state,
time,
)
Returns the log posterior probability for being in the state given by state at the time given by time in sequence.
CAVEAT: statePosterior needs to calculate the complete forward and backward matrices. If
you are interested in multiple states, it is more efficient to use the posterior function
directly than to make multiple calls to statePosterior.
|
|
toMatrices
|
toMatrices ( self )
To be defined in derived classes.
|
|
viterbi
|
viterbi ( self, emissionSequences )
Compute the Viterbi path for each sequence in emissionSequences.
emissionSequences can be either a SequenceSet or an EmissionSequence.
Result: [q_0, ..., q_T], the Viterbi path, if emissionSequences is an EmissionSequence
object; [[q_0^0, ..., q_T^0], ..., [q_0^k, ..., q_T^k]] for a k-sequence
SequenceSet.
Exceptions
|
|
TypeError, "EmissionSequence or SequenceSet required, got " + str( emissionSequences.__class__.__name__ )
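For reference, the Viterbi recursion for a single sequence can be sketched in plain Python as below, working in log space with backpointers. This is a generic textbook version for discrete models, not the library's C code; the list representation and the function name are hypothetical. For a set of sequences the function would simply be applied to each one in turn.

```python
import math


def viterbi_path(pi, A, B, obs):
    """Return (path, log_prob) of the most probable state path for obs."""
    def log(p):
        return math.log(p) if p > 0.0 else float("-inf")

    n = len(pi)
    # delta[i]: best log-probability of any path that ends in state i.
    delta = [log(pi[i]) + log(B[i][obs[0]]) for i in range(n)]
    backpointers = []
    for o in obs[1:]:
        pointers, new_delta = [], []
        for j in range(n):
            best = max(range(n), key=lambda i: delta[i] + log(A[i][j]))
            pointers.append(best)
            new_delta.append(delta[best] + log(A[best][j]) + log(B[j][o]))
        backpointers.append(pointers)
        delta = new_delta

    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, delta[last]
```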
|
|
|
write
|
write ( self, fileName )
Writes the HMM to the file fileName.
|
|