Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis

Tonatiuh Peña-Centeno
Neil D. Lawrence, University of Sheffield

Journal of Machine Learning Research 7, pp. 455-491

Abstract

In this paper we consider a novel Bayesian interpretation of Fisher’s discriminant analysis. We relate Rayleigh’s coefficient to a noise model that minimizes a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination and in this process, priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.
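As a rough illustration of the model the abstract describes, the sketch below implements the classical regularised kernel Fisher discriminant, i.e. the formulation the paper shows its Bayesian treatment to be equivalent to. Here the RBF lengthscale and the coefficient `gamma` stand in for the kernel parameters and regularisation coefficient that the paper proposes to set by maximising the marginal log-likelihood; in this sketch they are simply fixed by hand, and all function names are illustrative rather than taken from the paper or its software.

```python
import numpy as np

def rbf_kernel(X, Y, lengthscale=1.0):
    # Squared-exponential kernel; the lengthscale is one of the kernel
    # parameters the paper proposes to optimise via marginal likelihood.
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

def kernel_fisher_discriminant(X, y, gamma=1e-2, lengthscale=1.0):
    """Regularised kernel Fisher discriminant for two classes.

    gamma plays the role of the regularisation coefficient that the
    Bayesian formulation links to the noise model and, in turn, to
    the generalisation error.
    """
    K = rbf_kernel(X, X, lengthscale)
    n = len(y)
    classes = np.unique(y)
    assert len(classes) == 2, "binary discriminant only"
    # Class-mean kernel vectors (the 'class centres' in kernel space).
    m = [K[:, y == c].mean(axis=1) for c in classes]
    # Within-class scatter matrix in the kernel-induced feature space.
    N = np.zeros((n, n))
    for c in classes:
        Kc = K[:, y == c]
        lc = Kc.shape[1]
        N += Kc @ (np.eye(lc) - np.ones((lc, lc)) / lc) @ Kc.T
    # Dual coefficients of the direction of discrimination:
    # alpha = (N + gamma I)^{-1} (m_1 - m_2).
    alpha = np.linalg.solve(N + gamma * np.eye(n), m[0] - m[1])
    return alpha, K
```

A new point x is then projected with f(x) = Σᵢ αᵢ k(xᵢ, x); thresholding f gives the classification. The Bayesian view replaces hand-tuning `gamma` and `lengthscale` with gradient-based optimisation of the marginal log-likelihood.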


@Article{pena-fbd04,
  title = 	 {Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis},
  journal =  	 {Journal of Machine Learning Research},
  author = 	 {Tonatiuh Peña-Centeno and Neil D. Lawrence},
  pages = 	 {455--491},
  year = 	 {2006},
  volume = 	 {7},
  month = 	 {02},
  url =  	 {http://inverseprobability.com/publications/pena-fbd04.html},
  abstract = 	 {In this paper we consider a novel Bayesian interpretation of Fisher’s discriminant analysis. We relate Rayleigh’s coefficient to a noise model that minimizes a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination and in this process, priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.},
  key = 	 {Pena-fbd04},
  annote = 	 {An earlier version is available as technical report number CS-04-13, see \cite{Pena:fbd-tech04}.},
  linkpdf = 	 {http://www.jmlr.org/papers/volume7/centeno06a/centeno06a.pdf},
  linksoftware = {http://inverseprobability.com/bfd/},
  group = 	 {shefml}
}
%T Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis
%A Tonatiuh Peña-Centeno and Neil D. Lawrence
%C Journal of Machine Learning Research
%D 2006
%F pena-fbd04
%J Journal of Machine Learning Research
%P 455--491
%U http://inverseprobability.com/publications/pena-fbd04.html
%V 7
%X In this paper we consider a novel Bayesian interpretation of Fisher’s discriminant analysis. We relate Rayleigh’s coefficient to a noise model that minimizes a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination and in this process, priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.
TY  - JOUR
TI  - Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis
AU  - Tonatiuh Peña-Centeno
AU  - Neil D. Lawrence
PY  - 2006/02/01
DA  - 2006/02/01
ID  - pena-fbd04
SP  - 455
EP  - 491
L1  - http://www.jmlr.org/papers/volume7/centeno06a/centeno06a.pdf
UR  - http://inverseprobability.com/publications/pena-fbd04.html
AB  - In this paper we consider a novel Bayesian interpretation of Fisher’s discriminant analysis. We relate Rayleigh’s coefficient to a noise model that minimizes a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination and in this process, priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.
ER  -

Peña-Centeno, T. & Lawrence, N.D. (2006). Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis. Journal of Machine Learning Research 7:455-491.