This article briefly describes the principle, modeling and testing process of GMM-HMM in speech recognition.

1. What is the Hidden Markov Model? Three problems that an HMM has to solve: 1) Likelihood 2) Decoding 3) Training
2. What is a GMM? How do we use a GMM to find the probability of a phoneme?
3. Solving speech recognition with GMM+HMM
   3.1 Recognition
   3.2 Training
      3.2.1 Training the parameters of the GMM
      3.2.2 Training the parameters of the HMM

===========================================================================

1. What is the Hidden Markov Model?

ANS: a Markov process with hidden (unobservable) nodes and visible (observable) nodes. The hidden nodes represent the states, and the visible nodes represent the speech we hear or the timing signal we see.

At the beginning we specify the structure of the HMM. Training the HMM model then means: given a timing signal y1...yT (the training samples), estimate the following parameters by MLE (typically implemented with EM):

1. The initial probabilities of the N states
2. The state transition probabilities a
3. The output probabilities b

--------------

In speech processing, a word consists of several phonemes, and each HMM corresponds to a word or a phoneme. A word is represented as a sequence of states; in a word-level HMM, each state corresponds to a phoneme.

There are three problems that need to be solved with an HMM:

1) Likelihood: the probability that an HMM generates an observation sequence x <the forward algorithm>

    α_t(s_j) = Σ_i α_{t-1}(s_i) a_ij b_j(x_t)

where α_t(s_j) is the probability that the HMM is in state j at time t having generated the observations {x1, ..., xt}, a_ij is the transition probability from state i to state j, and b_j(x_t) is the probability of generating x_t in state j. (A code sketch of this recursion is given at the end of this section.)

2) Decoding: given an observation sequence x, find the most likely underlying HMM state sequence <the Viterbi algorithm>

In actual computation, pruning is applied: instead of computing the probability of every possible state sequence, we use the Viterbi approximation, which from time 1 to t records only the highest-probability predecessor state and its probability. Let V_t(s_j) be the maximum probability, over all states at time t-1, of arriving in state j at time t:

    V_t(s_j) = max_i V_{t-1}(s_i) a_ij b_j(x_t)

and record the backpointer

    bt_t(s_j) = argmax_i V_{t-1}(s_i) a_ij b_j(x_t)

i.e., the state at time t-1 from which the transition to state j at time t has the highest probability. After the recursion reaches the final time step, the most likely state sequence is recovered by backtracking through the recorded backpointers. (A code sketch of this is also given at the end of this section.)

3) Training: given an observation sequence x, train the HMM parameters λ = {a_ij, b_j} with the EM (forward-backward, i.e., Baum-Welch) algorithm. We defer this part, together with GMM training, to "3. Solving speech recognition with GMM+HMM".

--------------------------------------------------------------------------
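Below is a minimal NumPy sketch of the forward recursion above. It is illustrative only: the names pi, A, B and the assumption that the emission likelihoods b_j(x_t) are precomputed as a T×N matrix are conventions chosen here, not from the original text, and a real recognizer would work in log probabilities to avoid underflow.

import numpy as np

def forward(pi, A, B):
    """Forward algorithm: total likelihood P(x_1..x_T | HMM).

    pi : (N,)   initial state probabilities
    A  : (N, N) transition probabilities, A[i, j] = a_ij
    B  : (T, N) emission likelihoods,     B[t, j] = b_j(x_t)
    """
    T, N = B.shape
    alpha = np.zeros((T, N))
    alpha[0] = pi * B[0]                  # initialization at t = 0
    for t in range(1, T):
        # alpha_t(j) = sum_i alpha_{t-1}(i) * a_ij * b_j(x_t)
        alpha[t] = (alpha[t - 1] @ A) * B[t]
    return alpha[-1].sum()                # sum over final states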
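A matching sketch of the Viterbi recursion with backtracking, under the same assumed pi/A/B interface; again a sketch of the technique, not production code.

import numpy as np

def viterbi(pi, A, B):
    """Viterbi decoding: most likely state sequence and its probability."""
    T, N = B.shape
    V = np.zeros((T, N))            # V[t, j] = V_t(s_j), best path prob ending in j
    back = np.zeros((T, N), int)    # back[t, j] = bt_t(s_j), best predecessor of j
    V[0] = pi * B[0]
    for t in range(1, T):
        # V_t(j) = max_i V_{t-1}(i) * a_ij * b_j(x_t)
        scores = V[t - 1][:, None] * A      # scores[i, j] = V_{t-1}(i) * a_ij
        back[t] = scores.argmax(axis=0)
        V[t] = scores.max(axis=0) * B[t]
    # Backtracking: start from the best final state, follow the pointers.
    path = [int(V[-1].argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1], V[-1].max()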
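To tie the two sketches together, here is a hypothetical toy example: a 3-state left-to-right HMM of the kind typically used for a single phoneme. All numbers are invented for illustration; in a real system the emission likelihoods would come from each state's GMM.

import numpy as np

# Toy 3-state left-to-right HMM (numbers are made up for illustration).
pi = np.array([1.0, 0.0, 0.0])        # always start in the first state
A = np.array([[0.6, 0.4, 0.0],        # left-to-right topology:
              [0.0, 0.7, 0.3],        # self-loop or move one state forward
              [0.0, 0.0, 1.0]])
# Emission likelihoods b_j(x_t) for T = 5 observations.
B = np.array([[0.8, 0.1, 0.1],
              [0.6, 0.3, 0.1],
              [0.2, 0.6, 0.2],
              [0.1, 0.5, 0.4],
              [0.1, 0.2, 0.7]])

print("likelihood:", forward(pi, A, B))    # forward() from the sketch above
path, p = viterbi(pi, A, B)                # viterbi() from the sketch above
print("best state sequence:", path, "prob:", p)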