Justin Davis, University of Central Florida


Bayesian Feature Selection for Classification With Possibly Large Number of Classes


Abstract: Classification of high-dimensional vectors on the basis of a small number of samples has been an essential problem for at least a decade, but the associated problem of feature selection still has no clear general solution. Here we introduce two Bayesian models for feature selection in high-dimensional data, specifically for the purpose of classification. We show that particular cases of our models are akin to familiar ad hoc methods (e.g., ANOVA) and that the general case may be viewed as a natural extension of the feature annealed independence rule (FAIR) introduced by Fan and Fan (2008). By examining in-model and biological data, we demonstrate that the models supply a significantly better basis for classification than a common methodology does.

Advisors: Marianna Pensky and William Crampton, Department of Mathematics and Department of Biology, University of Central Florida