Optimising Kernel Parameters and Regularisation Coefficients for Non-linear Discriminant Analysis
Journal of Machine Learning Research, 7:455-491, 2006.
In this paper we consider a novel Bayesian interpretation of Fisher’s discriminiant analysis. We relate Rayleigh’s coefficient to a noise model that minimizes a cost based on the most probable class centres and that abandons the ‘regression to the labels’ assumption used by other algorithms. This yields a direction of discrimination equivalent to Fisher’s discriminant. We use Bayes’ rule to infer the posterior distribution for the direction of discrimination and in this process, priors and constraining distributions are incorporated to reach the desired result. Going further, with the use of a Gaussian process prior we show the equivalence of our model to a regularised kernel Fisher’s discriminant. A key advantage of our approach is the facility to determine kernel parameters and the regularisation coefficient through the optimisation of the marginal log-likelihood of the data. An added bonus of the new formulation is that it enables us to link the regularisation coefficient with the generalisation error.