# Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis

Tonatiuh Peña-Centeno
Neil D. Lawrence, University of Sheffield

Journal of Machine Learning Research 7, pp. 455–491

#### Abstract

In this paper we consider a novel Bayesian interpretation of Fisher’s discriminant analysis. We relate Rayleigh’s coefficient to a noise model that minimises a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination, and in this process priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.
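The regularised kernel Fisher's discriminant that the paper recovers can be sketched in its standard dual form: the direction of discrimination is a vector of dual coefficients α = (N + λI)⁻¹(m₁ − m₀), where m₀, m₁ are the kernel-space class means and N the within-class scatter in the dual. The sketch below is a minimal NumPy illustration of that classical formulation only; it fixes the regularisation coefficient `reg` and kernel width `gamma` by hand rather than optimising the marginal log-likelihood as the paper proposes, and the RBF kernel and toy data are illustrative assumptions, not the paper's code.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Gaussian (RBF) kernel between the rows of A and B.
    d2 = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * d2)

def kernel_fisher_discriminant(X, y, gamma=1.0, reg=1e-3):
    """Regularised kernel Fisher's discriminant, binary labels in {0, 1}.

    Returns the dual coefficients alpha and the kernel matrix K;
    a point x_j is projected as sum_i alpha_i k(x_i, x_j).
    """
    K = rbf_kernel(X, X, gamma)
    n = X.shape[0]
    M = []                      # kernel-space class means m_0, m_1
    N = np.zeros((n, n))        # within-class scatter in the dual
    for c in (0, 1):
        Kc = K[:, y == c]       # kernel columns for class c
        nc = Kc.shape[1]
        M.append(Kc.mean(axis=1))
        H = np.eye(nc) - np.ones((nc, nc)) / nc   # centring matrix
        N += Kc @ H @ Kc.T
    # reg plays the role of the regularisation coefficient; the paper's
    # contribution is to optimise it (and gamma) via the marginal
    # log-likelihood, whereas here it is simply fixed.
    alpha = np.linalg.solve(N + reg * np.eye(n), M[1] - M[0])
    return alpha, K

# Toy data: two well-separated Gaussian blobs.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 1, (20, 2)), rng.normal(2, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
alpha, K = kernel_fisher_discriminant(X, y, gamma=0.5, reg=1e-2)
proj = K @ alpha    # one-dimensional projections of the training points
```

Because α solves a positive-definite system against m₁ − m₀, the mean projection of class 1 is guaranteed to exceed that of class 0, so the two classes separate along the learned direction.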

```bibtex
@Article{pena-fbd04,
  title        = {Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis},
  author       = {Tonatiuh Peña-Centeno and Neil D. Lawrence},
  journal      = {Journal of Machine Learning Research},
  year         = {2006},
  month        = {2},
  volume       = {7},
  pages        = {455--491},
  url          = {http://inverseprobability.com/publications/pena-fbd04.html},
  annote       = {An earlier version is available as technical report number CS-04-13, see \cite{Pena:fbd-tech04}.},
  linkpdf      = {http://www.jmlr.org/papers/volume7/centeno06a/centeno06a.pdf},
  linksoftware = {http://inverseprobability.com/bfd/}
}
```
Peña-Centeno, T. and Lawrence, N.D. (2006). Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis. Journal of Machine Learning Research 7:455–491.