Mixture Representations for Inference and Learning in Boltzmann Machines

Neil D. Lawrence; Christopher M. Bishop; Michael I. Jordan

Uncertainty in Artificial Intelligence, Morgan Kaufmann 14:320-327, 1998.

Abstract

Boltzmann machines are undirected graphical models with two-state stochastic variables, in which the logarithms of the clique potentials are quadratic functions of the node states. They have been widely studied in the neural computing literature, although their practical applicability has been limited by the difficulty of finding an effective learning algorithm. One well-established approach, known as mean field theory, represents the stochastic distribution using a factorized approximation. However, the corresponding learning algorithm often fails to find a good solution. We conjecture that this is due to the implicit uni-modality of the mean field approximation, which is therefore unable to capture multi-modality in the true distribution. In this paper we use variational methods to approximate the stochastic distribution using multi-modal mixtures of factorized distributions. We present results for both inference and learning to demonstrate the effectiveness of this approach.
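
The uni-modality issue described in the abstract can be illustrated with a toy sketch. This is not the paper's algorithm: it assumes a hypothetical two-unit Boltzmann machine with states in {-1, +1}, an assumed coupling `W = 2.0` and bias `B = 0.0`, and it builds the two-component mixture by mirroring the mean field solution rather than by optimizing a variational bound. All function names are illustrative.

```python
import itertools
import math

# Hypothetical two-unit Boltzmann machine, states in {-1, +1} (assumed convention).
W = 2.0  # coupling weight between the two units (assumed value)
B = 0.0  # shared bias (assumed value)

STATES = list(itertools.product([-1, 1], repeat=2))


def energy(s):
    """Quadratic energy; P(s) is proportional to exp(-energy(s))."""
    return -W * s[0] * s[1] - B * (s[0] + s[1])


def exact_distribution():
    """Exact probabilities by enumerating all four joint states."""
    unnorm = [math.exp(-energy(s)) for s in STATES]
    z = sum(unnorm)
    return [u / z for u in unnorm]


def mean_field(m_init, iters=200):
    """Iterate m = tanh(W * m + B); both units share m by symmetry."""
    m = m_init
    for _ in range(iters):
        m = math.tanh(W * m + B)
    return m


def factorized(m1, m2):
    """Joint probabilities of a fully factorized distribution with means m1, m2."""
    return [((1 + s1 * m1) / 2) * ((1 + s2 * m2) / 2) for s1, s2 in STATES]


def kl(q, p):
    """KL(q || p) over the enumerated states."""
    return sum(qi * math.log(qi / pi) for qi, pi in zip(q, p) if qi > 0)


p = exact_distribution()

# A single factorized distribution settles into one of the two modes.
m_plus = mean_field(0.5)
q_mf = factorized(m_plus, m_plus)

# Heuristic mixture: equal weights on the mean field solution and its mirror image.
q_minus = factorized(-m_plus, -m_plus)
q_mix = [0.5 * a + 0.5 * b for a, b in zip(q_mf, q_minus)]

kl_mf = kl(q_mf, p)
kl_mix = kl(q_mix, p)
print("mean field KL:", kl_mf)
print("mixture KL:   ", kl_mix)
```

Under these assumed parameters the true distribution has two equally probable modes, (+1, +1) and (-1, -1). The single factorized distribution covers only one of them, while the mirrored mixture covers both and yields a much smaller KL divergence.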

Cite this Paper

BibTeX


@InProceedings{Lawrence:mixtures98,
  title = 	 {Mixture Representations for Inference and Learning in {B}oltzmann Machines},
  author = 	 {Lawrence, Neil D. and Bishop, Christopher M. and Jordan, Michael I.},
  booktitle = 	 {Uncertainty in Artificial Intelligence},
  pages = 	 {320--327},
  year = 	 {1998},
  editor = 	 {Cooper, Gregory F. and Moral, Serafín},
  volume = 	 {14},
  address = 	 {San Francisco, CA},
  publisher =    {Morgan Kaufmann},
  pdf = 	 {https://inverseprobability.com/publications/files/boltzmann.pdf},
  url = 	 {/publications/lawrence-mixtures98.html},
  abstract = 	 {Boltzmann machines are undirected graphical models with two-state
stochastic variables, in which the logarithms of the clique
potentials are quadratic functions of the node states. They have
been widely studied in the neural computing literature, although
their practical applicability has been limited by the difficulty of
finding an effective learning algorithm. One well-established
approach, known as mean field theory, represents the stochastic
distribution using a factorized approximation.  However, the
corresponding learning algorithm often fails to find a good
solution.  We conjecture that this is due to the implicit
uni-modality of the mean field approximation which is therefore
unable to capture multi-modality in the true distribution. In this
paper we use variational methods to approximate the stochastic
distribution using multi-modal *mixtures* of factorized
distributions. We present results for both inference and learning to
demonstrate the effectiveness of this approach.
}
}

Endnote

%0 Conference Paper
%T Mixture Representations for Inference and Learning in Boltzmann Machines
%A Neil D. Lawrence
%A Christopher M. Bishop
%A Michael I. Jordan
%B Uncertainty in Artificial Intelligence
%D 1998
%E Gregory F. Cooper
%E Serafín Moral
%F Lawrence:mixtures98
%I Morgan Kaufmann
%P 320--327
%U /publications/lawrence-mixtures98.html
%V 14
%X Boltzmann machines are undirected graphical models with two-state
stochastic variables, in which the logarithms of the clique
potentials are quadratic functions of the node states. They have
been widely studied in the neural computing literature, although
their practical applicability has been limited by the difficulty of
finding an effective learning algorithm. One well-established
approach, known as mean field theory, represents the stochastic
distribution using a factorized approximation.  However, the
corresponding learning algorithm often fails to find a good
solution.  We conjecture that this is due to the implicit
uni-modality of the mean field approximation which is therefore
unable to capture multi-modality in the true distribution. In this
paper we use variational methods to approximate the stochastic
distribution using multi-modal *mixtures* of factorized
distributions. We present results for both inference and learning to
demonstrate the effectiveness of this approach.

RIS


TY  - CPAPER
TI  - Mixture Representations for Inference and Learning in Boltzmann Machines
AU  - Neil D. Lawrence
AU  - Christopher M. Bishop
AU  - Michael I. Jordan
BT  - Uncertainty in Artificial Intelligence
DA  - 1998/01/01
ED  - Gregory F. Cooper
ED  - Serafín Moral
ID  - Lawrence:mixtures98
PB  - Morgan Kaufmann
VL  - 14
SP  - 320
EP  - 327
L1  - https://inverseprobability.com/publications/files/boltzmann.pdf
UR  - /publications/lawrence-mixtures98.html
AB  - Boltzmann machines are undirected graphical models with two-state
stochastic variables, in which the logarithms of the clique
potentials are quadratic functions of the node states. They have
been widely studied in the neural computing literature, although
their practical applicability has been limited by the difficulty of
finding an effective learning algorithm. One well-established
approach, known as mean field theory, represents the stochastic
distribution using a factorized approximation.  However, the
corresponding learning algorithm often fails to find a good
solution.  We conjecture that this is due to the implicit
uni-modality of the mean field approximation which is therefore
unable to capture multi-modality in the true distribution. In this
paper we use variational methods to approximate the stochastic
distribution using multi-modal *mixtures* of factorized
distributions. We present results for both inference and learning to
demonstrate the effectiveness of this approach.

ER  -

APA


Lawrence, N.D., Bishop, C.M. & Jordan, M.I. (1998). Mixture Representations for Inference and Learning in Boltzmann Machines. Uncertainty in Artificial Intelligence 14:320-327. Available from /publications/lawrence-mixtures98.html.