EM Algorithm


The EM algorithm is an iterative optimization algorithm; EM is short for expectation-maximization. EM does not apply to every optimization problem: it targets maximum-likelihood estimation in models with latent variables.

Remember: EM always converges, but possibly only to a local optimum.

EM applies to GMMs, K-Means clustering, HMMs, etc.

Jensen’s Inequality

A convex function $f$ has $f'' > 0$; a concave function $f$ has $f'' < 0$.
$E[f(x)] \geq f(E[x])$ if $f$ is convex
$E[f(x)] \leq f(E[x])$ if $f$ is concave
Equality holds if and only if $x$ is constant.
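
A quick numerical sanity check in NumPy, using the convex function $f(x) = x^2$ (so $f'' = 2 > 0$); the sampled distribution here is an arbitrary illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=1.0, scale=2.0, size=100_000)  # x ~ N(1, 2^2)

f = np.square  # f(x) = x^2 is convex: f''(x) = 2 > 0

print("E[f(x)] =", f(x).mean())  # ~5.0, since E[x^2] = mu^2 + sigma^2
print("f(E[x]) =", f(x.mean()))  # ~1.0, since (E[x])^2 = mu^2
```

The gap between the two sides (here $\sigma^2 = 4$) closes only when $x$ is constant, matching the equality condition above.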

EM

$\{z_i\}_{i=1}^{N}$ — the latent variables: $x_i$ belongs to the $z_i^{th}$ distribution
$Q_i(z_i)$ — the probability distribution of $z_i$
(You can also turn $z_i$ into a vector in which each dimension represents the probability of the corresponding component.)


MLE:
$\max\limits_{\theta} L(\theta)$
$L(\theta) = \prod\limits_{i=1}^N P(x_i|\theta) = \prod\limits_{i=1}^N \Big[\sum\limits_{z_i}P(x_i, z_i | \theta)\Big]$
$\ell(\theta) = \log L(\theta) = \sum\limits_{i=1}^N\log\Big[\sum\limits_{z_i} P(x_i,z_i|\theta)\Big]$
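
The sum inside the $\log$ makes $\ell(\theta)$ hard to maximize directly. The standard workaround, which yields the M-step objective below, is to introduce any distribution $Q_i(z_i)$ and apply Jensen's inequality to the concave $\log$:

$$
\ell(\theta) = \sum\limits_{i=1}^N\log\Big[\sum\limits_{z_i} Q_i(z_i)\frac{P(x_i,z_i|\theta)}{Q_i(z_i)}\Big] \geq \sum\limits_{i=1}^N\sum\limits_{z_i} Q_i(z_i)\log\frac{P(x_i,z_i|\theta)}{Q_i(z_i)}
$$

Equality holds when $\frac{P(x_i,z_i|\theta)}{Q_i(z_i)}$ is constant in $z_i$, i.e. when $Q_i(z_i) = P(z_i|x_i,\theta)$, which is exactly what the E-step computes.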


$\theta_k$ — the parameter estimate at the $k^{th}$ iteration

randomize $\theta_0$
while (!converge)
{
$$
\begin{aligned}
\text{E-step:}&\quad Q_i(z_i) = \frac{P(x_i, z_i|\theta_k)}{\sum\limits_{z_i} P(x_i,z_i|\theta_k)}\\[4pt]
\text{M-step:}&\quad \theta_{k+1} = \arg\max\limits_{\theta}\sum\limits_{i=1}^N\sum\limits_{z_i} Q_i(z_i)\log \frac{P(x_i,z_i|\theta)}{Q_i(z_i)}
\end{aligned}
$$
}
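
To make the loop concrete, here is a minimal NumPy sketch of EM for a two-component 1-D Gaussian mixture. The synthetic data, initialization, and tolerance are illustrative choices; the M-step uses the standard closed-form GMM updates (for a GMM, the $\arg\max$ has a closed form, so no numerical optimizer is needed):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic data from a known two-component mixture (illustrative).
x = np.concatenate([rng.normal(-2.0, 1.0, 300), rng.normal(3.0, 0.5, 700)])
N = len(x)

# theta_0: mixture weights pi, means mu, standard deviations sigma.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

prev_ll = -np.inf
for k in range(200):
    # E-step: Q[i, z] = P(x_i, z | theta_k) / sum_z P(x_i, z | theta_k).
    joint = pi * normal_pdf(x[:, None], mu, sigma)  # (N, 2) joint probabilities
    Q = joint / joint.sum(axis=1, keepdims=True)    # responsibilities P(z | x_i, theta_k)

    # M-step: closed-form argmax of the lower bound for a GMM.
    Nk = Q.sum(axis=0)                              # effective count per component
    pi = Nk / N
    mu = (Q * x[:, None]).sum(axis=0) / Nk
    sigma = np.sqrt((Q * (x[:, None] - mu) ** 2).sum(axis=0) / Nk)

    # Convergence test on the log-likelihood l(theta), which never decreases.
    ll = np.log(joint.sum(axis=1)).sum()
    if ll - prev_ll < 1e-8:
        break
    prev_ll = ll

print("pi =", pi.round(3), "mu =", mu.round(3), "sigma =", sigma.round(3))
```

With this well-separated data the loop typically recovers parameters close to the generating mixture; a poor initialization can instead land on a worse local optimum, illustrating the caveat above.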

Jensen's inequality is used both to build the M-step lower bound and in the proof of convergence, as sketched below.
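
Concretely, convergence follows from the chain

$$
\ell(\theta_{k+1}) \geq \sum\limits_{i=1}^N\sum\limits_{z_i} Q_i(z_i)\log\frac{P(x_i,z_i|\theta_{k+1})}{Q_i(z_i)} \geq \sum\limits_{i=1}^N\sum\limits_{z_i} Q_i(z_i)\log\frac{P(x_i,z_i|\theta_k)}{Q_i(z_i)} = \ell(\theta_k)
$$

The first inequality is the Jensen lower bound, valid for any $\theta$; the second holds because $\theta_{k+1}$ maximizes that bound; the final equality holds because the E-step choice $Q_i(z_i) = P(z_i|x_i,\theta_k)$ makes the bound tight at $\theta_k$. So $\ell(\theta_k)$ never decreases, and when it is bounded above it converges, though not necessarily to the global optimum.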