Mixture Representations for Inference and Learning in Boltzmann Machines
Uncertainty in Artificial Intelligence, Morgan Kaufmann 14:320-327, 1998.
Abstract
Boltzmann machines are undirected graphical models with two-state
stochastic variables, in which the logarithms of the clique
potentials are quadratic functions of the node states. They have
been widely studied in the neural computing literature, although
their practical applicability has been limited by the difficulty of
finding an effective learning algorithm. One well-established
approach, known as mean field theory, represents the stochastic
distribution using a factorized approximation. However, the
corresponding learning algorithm often fails to find a good
solution. We conjecture that this is due to the implicit
uni-modality of the mean field approximation, which is therefore
unable to capture multi-modality in the true distribution. In this
paper we use variational methods to approximate the stochastic
distribution using multi-modal *mixtures* of factorized
distributions. We present results for both inference and learning to
demonstrate the effectiveness of this approach.
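The factorized (mean field) approximation discussed above can be sketched as follows. This is a minimal illustration, not the paper's mixture method: each unit's marginal m_i is updated by the standard fixed-point rule m_i = sigma(b_i + sum_j W_ij m_j), derived by minimizing the KL divergence from a fully factorized distribution to the Boltzmann distribution. The two-node weights and biases are made up for the demo.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mean_field(W, b, n_iters=100):
    """Mean-field fixed-point updates for a Boltzmann machine with
    units s_i in {0, 1}, symmetric weights W and biases b.
    Returns the approximate marginals m_i = q(s_i = 1)."""
    n = len(b)
    m = [0.5] * n  # uniform initialization
    for _ in range(n_iters):
        for i in range(n):
            # update m_i given the current marginals of the other units
            field = b[i] + sum(W[i][j] * m[j] for j in range(n) if j != i)
            m[i] = sigmoid(field)
    return m

# Illustrative 2-unit machine (hypothetical parameters)
W = [[0.0, 1.2],
     [1.2, 0.0]]
b = [-0.4, -0.2]
marginals = mean_field(W, b)
print(marginals)
```

Because the approximating distribution factorizes over units, the resulting q is uni-modal; this is exactly the limitation that motivates the mixture-of-factorized-distributions approach of the paper.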