# Mixture Representations for Inference and Learning in Boltzmann Machines

Neil D. Lawrence, University of Sheffield
Christopher M. Bishop, Microsoft Research, Cambridge
Michael I. Jordan, UC Berkeley

In Uncertainty in Artificial Intelligence 14, pp. 320-327

#### Abstract

Boltzmann machines are undirected graphical models with two-state stochastic variables, in which the logarithms of the clique potentials are quadratic functions of the node states. They have been widely studied in the neural computing literature, although their practical applicability has been limited by the difficulty of finding an effective learning algorithm. One well-established approach, known as mean field theory, represents the stochastic distribution using a factorized approximation. However, the corresponding learning algorithm often fails to find a good solution. We conjecture that this is due to the implicit uni-modality of the mean field approximation, which is therefore unable to capture multi-modality in the true distribution. In this paper we use variational methods to approximate the stochastic distribution using multi-modal *mixtures* of factorized distributions. We present results for both inference and learning to demonstrate the effectiveness of this approach.
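
The uni-modality argument in the abstract can be illustrated on a toy problem. The following is a minimal sketch, not the paper's algorithm: it assumes states in {-1, +1}, a two-unit network with a single strong positive coupling, and exact enumeration of all states (only feasible for tiny networks). The mixture here is built by hand from the two symmetric mean field solutions rather than optimized with a variational bound, and all function names are hypothetical.

```python
import itertools
import numpy as np

def boltzmann_probs(W, b):
    """Exact distribution over all states s in {-1,+1}^n by enumeration."""
    n = len(b)
    states = np.array(list(itertools.product([-1.0, 1.0], repeat=n)))
    # Negative energy: 0.5 * s^T W s + b^T s (W symmetric, zero diagonal).
    neg_energy = 0.5 * np.einsum("ki,ij,kj->k", states, W, states) + states @ b
    p = np.exp(neg_energy - neg_energy.max())
    return states, p / p.sum()

def mean_field(W, b, m0, iters=200):
    """Iterate the fixed-point equations m_i = tanh(sum_j W_ij m_j + b_i)."""
    m = np.array(m0, dtype=float)
    for _ in range(iters):
        for i in range(len(m)):
            m[i] = np.tanh(W[i] @ m + b[i])
    return m

def factorized_probs(states, m):
    """Q(s) = prod_i (1 + s_i m_i) / 2 for a factorized distribution with means m."""
    return np.prod((1.0 + states * m) / 2.0, axis=1)

def kl(q, p):
    """KL(q || p), skipping states where q is zero."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# Assumed toy model: two units, strong positive coupling, no biases.
# The true distribution has two modes, (+1, +1) and (-1, -1).
W = np.array([[0.0, 2.0], [2.0, 0.0]])
b = np.zeros(2)
states, p = boltzmann_probs(W, b)

# A single factorized distribution settles on one mode only.
m = mean_field(W, b, m0=[0.5, 0.5])
q_single = factorized_probs(states, m)

# A two-component mixture of factorized distributions covers both modes.
q_mixture = 0.5 * factorized_probs(states, m) + 0.5 * factorized_probs(states, -m)

kl_single = kl(q_single, p)
kl_mixture = kl(q_mixture, p)
print(f"mean field means: {m}")
print(f"KL(single || p)  = {kl_single:.4f}")
print(f"KL(mixture || p) = {kl_mixture:.4f}")
```

In this toy setting the mixture should give a noticeably smaller KL divergence than the single factorized distribution, because the latter places almost all of its mass on one of the two modes.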

    @InProceedings{lawrence-mixtures98,
      title = {Mixture Representations for Inference and Learning in Boltzmann Machines},
      author = {Neil D. Lawrence and Christopher M. Bishop and Michael I. Jordan},
      booktitle = {Uncertainty in Artificial Intelligence},
      pages = {320--327},
      year = {1998},
      editor = {Gregory F. Cooper and Serafín Moral},
      volume = {14},
      address = {San Francisco, CA},
      publisher = {Morgan Kaufmann},
      url = {http://inverseprobability.com/publications/lawrence-mixtures98.html},
      crossref = {Cooper:uai98},
      key = {Lawrence:mixtures98},
      linkpsgz = {http://www.thelawrences.net/neil/boltzmann.ps.gz}
    }

Lawrence, N.D., Bishop, C.M. & Jordan, M.I. (1998). Mixture Representations for Inference and Learning in Boltzmann Machines. Uncertainty in Artificial Intelligence 14:320-327.