HMM
Hidden Markov Model
- sentence $\lambda = \lambda_1\lambda_2…\lambda_n$
- predicted label sequence $o = o_1o_2…o_n$
- output: the $o$ that maximizes $P(o_1o_2…o_n|\lambda_1\lambda_2…\lambda_n)$
If all the words are independent of one another (observation independence / Markov assumption):
$$
P(o_1o_2…o_n|\lambda_1\lambda_2…\lambda_n) = P(o_1|\lambda_1)P(o_2|\lambda_2)…P(o_n|\lambda_n)
$$
By Bayes' theorem:
$$
P(o|\lambda) = \frac{P(\lambda | o)P(o)}{P(\lambda)}
$$
$P(\lambda)$ is constant for a given sentence, so:
$$
\arg\max_o P(o|\lambda) = \arg\max_o P(\lambda | o)P(o)
$$
For $P(\lambda | o)$, apply the observation independence (Markov) assumption:
$$
P(\lambda | o) = P(\lambda_1 | o_1)P(\lambda_2 | o_2)…P(\lambda_n | o_n)
$$
For $P(o)$, expand with the chain rule:
$$
P(o) = P(o_1)P(o_2|o_1)P(o_3|o_1, o_2)…P(o_n|o_1, o_2, …, o_{n-1})
$$
Applying the homogeneous Markov assumption (each state depends only on the previous one):
$$
P(o) = P(o_1)P(o_2|o_1)P(o_3|o_2)…P(o_n|o_{n-1})
$$
Multiplying the two factorizations together:
$$
P(\lambda | o)P(o) = P(o_1)P(\lambda_1 | o_1)P(o_2|o_1)P(\lambda_2 | o_2)…P(o_n|o_{n-1})P(\lambda_n | o_n)
$$
This is essentially a bigram model!
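As a quick illustration, here is a minimal sketch that scores one candidate state sequence with exactly this product. The two-state tagger and all probabilities are hypothetical toy values invented for the example:

```python
# A minimal sketch of scoring one candidate state sequence with the
# factorized product P(o_1)P(lambda_1|o_1) * prod_k P(o_k|o_{k-1})P(lambda_k|o_k).
# All states, words, and probabilities below are hypothetical toy values.

def score(states, words, pi, trans, emit):
    """Joint score P(lambda|o)P(o) under the two HMM assumptions."""
    p = pi[states[0]] * emit[states[0]][words[0]]
    for k in range(1, len(states)):
        p *= trans[states[k - 1]][states[k]] * emit[states[k]][words[k]]
    return p

# Toy two-state tagger: "B" = begins a word, "E" = ends a word.
pi = {"B": 0.7, "E": 0.3}
trans = {"B": {"B": 0.2, "E": 0.8}, "E": {"B": 0.6, "E": 0.4}}
emit = {"B": {"我": 0.5, "们": 0.1}, "E": {"我": 0.1, "们": 0.6}}

print(score(["B", "E"], ["我", "们"], pi, trans, emit))  # 0.7*0.5*0.8*0.6 = 0.168
```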
The HMM five-tuple:
- observation sequence (the sentence $\lambda$)
- state sequence (the labels $o$)
- initial state probabilities $P(o_1)$
- state transition matrix $P(o_k|o_{k-1})$
- observation probability matrix (emission probabilities) $P(\lambda_k|o_k)$
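As a sketch of how the five-tuple can be held in code (the two-state tag set, the vocabulary, and every value below are hypothetical toy numbers):

```python
import numpy as np

# A sketch of the HMM five-tuple as plain arrays; all values are
# hypothetical toy numbers, not learned parameters.
states = ["B", "E"]                  # hidden state set (values of the state sequence)
vocab = {"我": 0, "们": 1, "是": 2}    # observation symbols (values of the observation sequence)

pi = np.array([0.7, 0.3])            # initial state probabilities P(o_1)
A = np.array([[0.2, 0.8],            # transition matrix: A[i, j] = P(o_k = j | o_{k-1} = i)
              [0.6, 0.4]])
B = np.array([[0.5, 0.1, 0.4],       # emission matrix: B[i, w] = P(lambda_k = w | o_k = i)
              [0.1, 0.6, 0.3]])

obs = [vocab[w] for w in ["我", "们", "是"]]  # the observation sequence for one sentence
```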
Two assumptions:
- Observation independence (Markov) assumption: each observed word depends only on the current state.
- Homogeneous Markov assumption: each state depends only on the previous state.
In practice, $\max P(\lambda | o)P(o)$ is solved with dynamic programming, e.g. the Viterbi algorithm.
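A minimal Viterbi sketch, assuming the array layout from the five-tuple sketch above (the same hypothetical toy numbers; log probabilities are used to avoid underflow on long sentences):

```python
import numpy as np

# A minimal Viterbi sketch (a standard formulation, not taken from this
# document); pi, A, B follow the layout of the five-tuple sketch above.
def viterbi(obs, pi, A, B):
    n, S = len(obs), len(pi)
    delta = np.log(pi) + np.log(B[:, obs[0]])   # best log score of paths ending in each state
    back = np.zeros((n, S), dtype=int)          # backpointers: best previous state
    for k in range(1, n):
        scores = delta[:, None] + np.log(A)     # scores[i, j]: best path into i, then i -> j
        back[k] = scores.argmax(axis=0)
        delta = scores.max(axis=0) + np.log(B[:, obs[k]])
    path = [int(delta.argmax())]                # best final state
    for k in range(n - 1, 0, -1):               # follow backpointers to recover the path
        path.append(int(back[k][path[-1]]))
    return path[::-1]

pi = np.array([0.7, 0.3])
A = np.array([[0.2, 0.8], [0.6, 0.4]])
B = np.array([[0.5, 0.1, 0.4], [0.1, 0.6, 0.3]])
print(viterbi([0, 1, 2], pi, A, B))  # [0, 1, 0] with these toy numbers
```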