# Fast Sparse Gaussian Process Methods: The Informative Vector Machine

Neil D. Lawrence, University of Sheffield
Matthias Seeger, Amazon
Ralf Herbrich, Amazon

In Advances in Neural Information Processing Systems 15, pp. 625–632

#### Abstract

We present a framework for sparse Gaussian process (GP) methods which uses forward selection with criteria based on information-theoretical principles, previously suggested for active learning. In contrast to most previous work on sparse GPs, our goal is not only to learn sparse predictors (which can be evaluated in $O(d)$ rather than $O(n)$, $d \ll n$, $n$ the number of training points), but also to perform training under strong restrictions on time and memory requirements. The scaling of our method is at most $O(nd^2)$, and in large real-world classification experiments we show that it can match prediction performance of the popular support vector machine (SVM), yet it requires only a fraction of the training time. In contrast to the SVM, our approximation produces estimates of predictive probabilities (‘error bars’), allows for Bayesian model selection and is less complex in implementation.
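The abstract's forward-selection idea can be illustrated with a minimal sketch for the GP regression case: at each step, pick the point whose inclusion most reduces the posterior entropy, and update posterior variances with rank-one updates so the total cost stays $O(nd^2)$. This is an illustrative reconstruction, not the paper's implementation (the paper's main setting is classification, handled there with ADF-style updates); the names `rbf_kernel` and `ivm_select` and the Gaussian-noise assumption are mine.

```python
# Illustrative sketch of IVM-style greedy forward selection for GP
# regression with Gaussian noise. Assumed, not from the paper: the
# function names, the RBF kernel choice, and the noise model.
import numpy as np

def rbf_kernel(X, lengthscale=1.0):
    # Squared-exponential kernel matrix for rows of X.
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * sq / lengthscale**2)

def ivm_select(K, d, noise_var=0.1):
    """Greedily pick d of n points. At each step the score for point j
    is the entropy reduction 0.5 * log(1 + zeta_j / noise_var), where
    zeta_j is the current posterior variance at j. Variances are kept
    current with rank-one downdates, giving O(n d^2) total cost."""
    n = K.shape[0]
    zeta = np.diag(K).copy()   # posterior variances, initially prior
    M = np.zeros((0, n))       # rows satisfy: posterior cov = K - M^T M
    active = []
    for _ in range(d):
        score = 0.5 * np.log1p(zeta / noise_var)
        score[active] = -np.inf            # never re-select a point
        j = int(np.argmax(score))
        # Current posterior covariance column for j, then downdate.
        col = K[:, j] - M.T @ M[:, j]
        s = col / np.sqrt(zeta[j] + noise_var)
        zeta = zeta - s**2
        M = np.vstack([M, s])
        active.append(j)
    return active, zeta
```

A usage example: `active, zeta = ivm_select(rbf_kernel(X), d=10)` returns the selected indices and the remaining posterior variances; prediction then needs only the $d$ active points, giving the $O(d)$ evaluation cost mentioned in the abstract.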

    @InProceedings{lawrence-ivm02,
      title     = {Fast Sparse Gaussian Process Methods: The Informative Vector Machine},
      author    = {Neil D. Lawrence and Matthias Seeger and Ralf Herbrich},
      booktitle = {Advances in Neural Information Processing Systems},
      volume    = {15},
      pages     = {625--632},
      year      = {2003},
      editor    = {Sue Becker and Sebastian Thrun and Klaus Obermayer},
      address   = {Cambridge, MA},
      publisher = {MIT Press},
      url       = {http://inverseprobability.com/publications/lawrence-ivm02.html},
    }
Lawrence, N.D., Seeger, M. and Herbrich, R. (2003). Fast Sparse Gaussian Process Methods: The Informative Vector Machine. Advances in Neural Information Processing Systems 15:625–632.