Feature/Model Selection by the Linear Programming SVM
Erinija Pranckeviciene, Ray Somorjai, National Research Council Canada,
and Muoi N. Tran, CancerCare Manitoba, Canada
erinija.pranckevie@nrc-nrcc.gc.ca
Many real-world classification problems are represented by very sparse and
high-dimensional data. The recent successes of a linear programming support
vector machine (LPSVM) for feature selection motivated a deeper analysis of
the method when applied to sparse, multivariate data. Due to the sparseness,
the selection of a classification model is greatly influenced by the characteristics
of the particular dataset. In this study, we investigate a feature selection
strategy based on LPSVM as the initial feature filter, combined with state-of-the-art
classification rules, and apply it to five real-life datasets of the \textbf{Agnostic
learning vs. prior knowledge} challenge of IJCNN 2007. Our goal is to better
understand the robustness of LPSVM as a feature filter. Our analysis suggests
that LPSVM can be a useful black-box method for identifying the profile
of the informative features in the data. If the data are complex and better
separable by nonlinear methods, then feature pre-filtering by LPSVM enhances
the data representation for other classifiers.
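
For orientation, a common textbook form of the 1-norm (linear programming) SVM is sketched here. The notation is an assumption introduced for illustration ($x_i$ the training samples, $y_i \in \{-1,+1\}$ their labels, $w$ the weight vector, $b$ the bias, $\xi_i$ the slack variables, $C > 0$ the regularization constant), and the exact variant used in the study may differ:

\[
\min_{w,\,b,\,\xi}\ \|w\|_1 + C\sum_{i=1}^{n}\xi_i
\quad \mbox{subject to} \quad
y_i\,(w^{\top}x_i + b) \ge 1-\xi_i,\quad \xi_i \ge 0,\quad i=1,\dots,n.
\]

Writing $w = u - v$ with $u, v \ge 0$ and replacing $\|w\|_1$ by $\sum_j (u_j + v_j)$ makes the objective and all constraints linear, so the problem can be handed to a standard linear programming solver. The 1-norm penalty tends to drive many components of $w$ exactly to zero, and under this reading the features with nonzero $w_j$ are the ones that the filter retains for the subsequent classifiers.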