# Gaussian Process Models for Visualisation of High Dimensional Data

Neil D. Lawrence, University of Sheffield

In Advances in Neural Information Processing Systems 16, pp. 329-336

#### Abstract

In this paper we introduce a new underlying probabilistic model for principal component analysis (PCA). Our formulation interprets PCA as a particular Gaussian process prior on a mapping from a latent space to the observed data-space. We show that if the prior’s covariance function constrains the mappings to be linear, the model is equivalent to PCA. We then extend the model by considering less restrictive covariance functions which allow non-linear mappings. This more general Gaussian process latent variable model (GPLVM) is then evaluated as an approach to the visualisation of high dimensional data for three different data-sets. Additionally, our non-linear algorithm can be further kernelised, leading to ‘twin kernel PCA’, in which a mapping between feature spaces occurs.
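
The link between the linear covariance function and PCA can be illustrated with a minimal NumPy sketch. It assumes centred data, a linear kernel with unit scale, and a fixed noise variance; the function names (`gplvm_log_likelihood`, `linear_gplvm_latents`, and so on) are illustrative and are not taken from the paper's software. Under these assumptions the likelihood-maximising latent positions come from an eigendecomposition and line up with the PCA scores; replacing the kernel with a non-linear one such as an RBF gives the more general model, whose latents would normally be found by gradient-based optimisation (not shown here).

```python
import numpy as np

def linear_kernel(X):
    # Linear covariance: constrains the latent-to-data mapping to be linear.
    return X @ X.T

def rbf_kernel(X, variance=1.0, lengthscale=1.0):
    # A non-linear covariance; hyperparameter defaults are arbitrary assumptions.
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return variance * np.exp(-0.5 * d2 / lengthscale**2)

def gplvm_log_likelihood(X, Y, kernel, noise_var):
    # log p(Y | X): each of the D columns of Y is treated as an independent
    # zero-mean Gaussian process draw with covariance K = kernel(X) + noise_var * I.
    N, D = Y.shape
    K = kernel(X) + noise_var * np.eye(N)
    _, logdet = np.linalg.slogdet(K)
    Kinv_Y = np.linalg.solve(K, Y)
    return (-0.5 * D * N * np.log(2.0 * np.pi)
            - 0.5 * D * logdet
            - 0.5 * np.sum(Y * Kinv_Y))

def linear_gplvm_latents(Y, q, noise_var):
    # Closed-form maximiser for the linear kernel (up to rotation of the
    # latent space): top-q eigenvectors of Y Y^T / D, scaled by
    # sqrt(eigenvalue - noise_var).
    N, D = Y.shape
    eigvals, eigvecs = np.linalg.eigh(Y @ Y.T / D)
    order = np.argsort(eigvals)[::-1][:q]
    scale = np.sqrt(np.maximum(eigvals[order] - noise_var, 0.0))
    return eigvecs[:, order] * scale

# Synthetic data with a 2-D linear latent structure (an assumption for the demo).
rng = np.random.default_rng(0)
Y = rng.normal(size=(40, 2)) @ rng.normal(size=(2, 6))
Y = Y + 0.1 * rng.normal(size=Y.shape)
Y = Y - Y.mean(axis=0)  # PCA assumes centred data

X_lin = linear_gplvm_latents(Y, q=2, noise_var=0.01)
print("linear kernel log-likelihood:",
      gplvm_log_likelihood(X_lin, Y, linear_kernel, noise_var=0.01))
print("RBF kernel log-likelihood at the same latents:",
      gplvm_log_likelihood(X_lin, Y, rbf_kernel, noise_var=0.01))
```

The same `gplvm_log_likelihood` function serves both cases; only the covariance function changes, which is the sense in which the non-linear model generalises the PCA one.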

    @InProceedings{lawrence-gplvm03,
      title = {Gaussian Process Models for Visualisation of High Dimensional Data},
      author = {Neil D. Lawrence},
      booktitle = {Advances in Neural Information Processing Systems},
      pages = {329--336},
      year = {2004},
      editor = {Sebastian Thrun and Lawrence Saul and Bernhard Schölkopf},
      volume = {16},
      address = {Cambridge, MA},
      publisher = {MIT Press},
      url = {http://inverseprobability.com/publications/lawrence-gplvm03.html},
      crossref = {Thrun:nips03},
      key = {Lawrence:gplvm03},
      linkpsgz = {ftp://ftp.dcs.shef.ac.uk/home/neil/gplvm.ps.gz},
      linksoftware = {http://inverseprobability.com/gplvm/},
      group = {shefml,gplvm,dimensional reduction}
    }

Lawrence, N. D. (2004). Gaussian Process Models for Visualisation of High Dimensional Data. Advances in Neural Information Processing Systems 16:329-336.