Statistical physics, mixtures of distributions, and the EM algorithm
We show that there are strong relationships between approaches to optimization and learning based on statistical physics or mixtures of experts. In particular, the EM algorithm can be interpreted as converging either to a local maximum of the mixture model or to a saddle point solution of the statistical physics system. An advantage of the statistical physics approach is that it naturally gives rise to a heuristic continuation method, deterministic annealing, for finding good solutions.
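The abstract does not give the formulation itself, so the following is only a minimal sketch of how deterministic annealing can wrap EM-style updates, not the authors' method. It assumes a one-dimensional mixture with equal mixing weights and fixed unit variance, so only the means are learned. The function names, the inverse-temperature schedule `betas`, and the small deterministic `jitter` used to break symmetry at each temperature are all hypothetical choices made for illustration.

```python
import math


def responsibilities(xs, mus, beta):
    """Tempered E-step: r[i][k] is proportional to exp(-beta * (x_i - mu_k)^2 / 2)."""
    resp = []
    for x in xs:
        energies = [0.5 * (x - mu) ** 2 for mu in mus]
        e_min = min(energies)  # subtract the minimum energy for numerical stability
        weights = [math.exp(-beta * (e - e_min)) for e in energies]
        total = sum(weights)
        resp.append([w / total for w in weights])
    return resp


def update_means(xs, resp, mus):
    """M-step: each mean becomes the responsibility-weighted average of the data."""
    new_mus = []
    for k in range(len(mus)):
        mass = sum(r[k] for r in resp)
        if mass == 0.0:
            new_mus.append(mus[k])  # keep the old mean if no point supports it
        else:
            new_mus.append(sum(r[k] * x for r, x in zip(resp, xs)) / mass)
    return new_mus


def annealed_em(xs, init_mus, betas, iters_per_beta=50, jitter=1e-3):
    """Run EM-style updates at each inverse temperature beta, from low to high.

    Assumption: beta = 1 corresponds to ordinary EM for a unit-variance mixture.
    The jitter is an illustrative device: it nudges the means apart at the start
    of each stage so they can separate once beta passes a critical value.
    """
    mus = list(init_mus)
    center = (len(mus) - 1) / 2.0
    for beta in betas:
        mus = [mu + jitter * (k - center) for k, mu in enumerate(mus)]
        for _ in range(iters_per_beta):
            resp = responsibilities(xs, mus, beta)
            mus = update_means(xs, resp, mus)
    return mus


if __name__ == "__main__":
    data = [-2.1, -2.0, -1.9, 1.9, 2.0, 2.1]
    schedule = [0.05, 0.1, 0.2, 0.5, 1.0]  # low beta = high temperature
    print(sorted(annealed_em(data, [-0.1, 0.1], schedule)))
```

At small `beta` the soft assignments are nearly uniform and every mean is pulled toward the overall data average; as `beta` grows the assignments sharpen and the means separate toward the clusters. This is the continuation idea in miniature: solve an easy, smoothed problem first and track its solution as the smoothing is removed.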
© 1994 Massachusetts Institute of Technology. Posted Online April 10, 2008. We would like to thank Eric Mjolsness and Anand Rangarajan for helpful conversations and encouragement. One of us (A.L.Y.) thanks DARPA and the Air Force for support under contract F49620-92-J-0466 and Geoffrey Hinton for a helpful conversation.