# The Informative Vector Machine: A Practical Probabilistic Alternative to the Support Vector Machine

Neil D. Lawrence, University of Sheffield
Matthias Seeger, Amazon
Ralf Herbrich, Amazon

#### Abstract

We present a practical probabilistic alternative to the popular support vector machine (SVM). The algorithm is an approximation to a Gaussian process, and is probabilistic in the sense that it maintains the process variance that is implied by the use of a kernel function, which the SVM discards. We show that these variances may be tracked and made use of in the selection of an active set which gives a sparse representation for the model. For an active set size of $d$ our algorithm exhibits $O(d^{2}N)$ computational complexity and $O(dN)$ storage requirements. It has already been shown that the approach is competitive with the SVM in terms of performance and running time; here we give more details of the approach and demonstrate that kernel parameters may also be learned in a practical and effective manner.
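The selection mechanism the abstract describes can be illustrated in a few lines. The sketch below is not the authors' code: it assumes a Gaussian-noise regression posterior rather than the ADF/EP site updates the full IVM uses for classification, and the function and variable names (`ivm_select`, `noise_var`) are invented for illustration. It shows the two properties claimed in the abstract: posterior variances are tracked and drive a greedy entropy-reduction choice of active points, and only a $d \times N$ matrix is stored, giving $O(dN)$ memory and $O(d^{2}N)$ time.

```python
import numpy as np

def ivm_select(K, noise_var, d):
    """Greedy active-set selection in the spirit of the IVM (sketch).

    Tracks the diagonal of the GP posterior covariance and, at each of
    d steps, includes the point whose inclusion most reduces the
    posterior differential entropy.  Only the d x N matrix M of
    rank-one update vectors is stored, never the full N x N posterior.
    """
    N = K.shape[0]
    diag = K.diagonal().copy()   # current posterior variances
    M = np.zeros((d, N))         # rank-d posterior representation: Sigma = K - M^T M
    active = []
    for j in range(d):
        # Entropy reduction for a Gaussian: 0.5 * log(1 + var_i / noise_var),
        # so the greedy choice is the point with largest posterior variance.
        scores = 0.5 * np.log1p(diag / noise_var)
        scores[active] = -np.inf
        i = int(np.argmax(scores))
        active.append(i)
        # Row i of the current posterior covariance, via the stored factors: O(dN).
        s_i = K[i] - M[:j, i] @ M[:j]
        m_i = s_i / np.sqrt(noise_var + diag[i])
        M[j] = m_i
        diag = diag - m_i ** 2   # rank-one downdate of the tracked variances
    return active, diag
```

Each iteration costs $O(dN)$, so $d$ iterations give the $O(d^{2}N)$ total complexity quoted above; the variances returned in `diag` are exactly the quantities an SVM would discard.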

  @TechReport{lawrence-ivmtech04,
    title       = {The Informative Vector Machine: A Practical Probabilistic Alternative to the Support Vector Machine},
    author      = {Neil D. Lawrence and Matthias Seeger and Ralf Herbrich},
    year        = {2004},
    institution = {Department of Computer Science, University of Sheffield},
    number      = {CS-04-07},
    url         = {http://inverseprobability.com/publications/lawrence-ivmtech04.html},
    note        = {Last updated December 2005},
    linkpsgz    = {ftp://ftp.dcs.shef.ac.uk/home/neil/ivmTechreport.ps.gz}
  }
 Lawrence, N.D., Seeger, M. and Herbrich, R. (2004). The Informative Vector Machine: A Practical Probabilistic Alternative to the Support Vector Machine. Technical Report CS-04-07, Department of Computer Science, University of Sheffield.