Feature Selection with the Potential Support Vector Machine

Sepp Hochreiter

We introduce a new feature selection method, called the "Potential Support Vector Machine" (P-SVM), which is based on the support vector machine technique and extracts those features from a data set that are important for constructing a classifier. Like standard SVMs, the P-SVM is based on the structural risk minimization principle, but in contrast to standard SVMs it minimizes a new objective function. This new objective expresses an upper bound on the generalization error, obtained by computing bounds via covering numbers. Another difference from standard SVMs is that the P-SVM's class-separating hyperplane is described by feature vectors (the so-called "support features"), which formally assume the role of support vectors. Feature selection therefore reduces to identifying these support features. To introduce feature vectors, the P-SVM treats the given data matrix as a matrix of dot products between feature vectors and the vectors to be classified. For example, a gene expression matrix is interpreted as a matrix of dot products between fixed gene vectors and variable tissue vectors. To ensure that a data matrix is indeed a dot product matrix, the measurement of a feature in an object must obey a given protocol, so that the expected feature value for an object remains constant. The performance of the P-SVM feature selection method is demonstrated on data sets obtained from patients with certain types of cancer (brain tumor, lymphoma, and breast cancer), where the outcome of a chemo- or radiation therapy must be predicted from the gene expression profile. For classification after P-SVM feature selection, generalization performance improves over previously proposed methods. In addition, the P-SVM extracts genes which may be important for drug development.
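The dot-product view of the data matrix can be made concrete with a small sketch. Note this is not the P-SVM optimization itself (which minimizes the covering-number-based objective described above); as a hypothetical stand-in, features are scored by a simple correlation with the labels and a sparse weight vector is kept only on the selected "support features". The toy gene expression matrix, the score rule, and the selection size `k` are all illustrative assumptions.

```python
import numpy as np

# Toy "gene expression" matrix: rows = tissue samples, columns = genes.
# Entry X[i, j] is read as a dot product <tissue_i, gene_j>.
# Gene 0 tracks the class label, gene 2 anti-tracks it; genes 1 and 3 are noise.
X = np.array([
    [ 1.0,  0.2, -0.9,  0.1],
    [ 0.8, -0.1, -1.1, -0.2],
    [ 1.2,  0.0, -0.8,  0.3],
    [-0.9,  0.3,  1.0, -0.1],
    [-1.1, -0.2,  0.9,  0.2],
    [-0.7,  0.1,  1.2,  0.0],
])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])  # class labels

# Score each gene by its (empirical) correlation with the labels.
# (Illustrative stand-in for the P-SVM objective.)
scores = X.T @ y / len(y)

# Keep the k highest-scoring genes as "support features".
k = 2
support_features = np.argsort(-np.abs(scores))[:k]

# Sparse weight vector: nonzero only on the support features.
w = np.zeros_like(scores)
w[support_features] = scores[support_features]

# Classify tissues with the sparse linear rule.
predictions = np.sign(X @ w)
accuracy = float(np.mean(predictions == y))
print(sorted(support_features.tolist()), accuracy)  # → [0, 2] 1.0
```

The sketch shows the structural point of the abstract: the classifier is expressed directly in terms of a small set of feature (gene) vectors, so selecting features and identifying the vectors that describe the hyperplane are one and the same step.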