Feature Selection with CLOP 

Isabelle Guyon

  • What is CLOP?

CLOP is a Matlab package built on top of the Spider for the WCCI 2006 performance prediction challenge. Students in the WS2005/06 class on feature extraction at ETH Zurich used it to outperform the best results of the NIPS 2003 feature selection challenge.

  • Stunning results!

The students (all undergraduates) were provided with baseline models implemented with CLOP. As part of the class requirements, they were asked to outperform the baseline methods. They were given extra credit if they outperformed or matched the best challenge entry (within the statistical error bar). They succeeded (Figure 1)! Note that the baseline method was already in the tenth percentile of the best entries (Figure 2)!

Philip Gardner

Figure 1: Balanced error rates (BER) comparison.
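
For readers unfamiliar with the metric, the balanced error rate is the average of the error rates on the positive class and on the negative class, so a classifier cannot score well by always predicting the majority class. The sketch below is only an illustration in Python, not CLOP code; the function name balanced_error_rate and the toy labels are assumptions made for this example, and labels are assumed to be coded as +1 and -1.

```python
def balanced_error_rate(y_true, y_pred):
    """Average of the per-class error rates, for labels coded as +1 / -1.

    Hypothetical helper written for illustration; it is not part of CLOP.
    """
    # Split the predictions according to the true class of each example.
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    if not pos or not neg:
        raise ValueError("both classes must be present to compute the BER")
    # Fraction of misclassified examples within each class.
    pos_err = sum(p != 1 for p in pos) / len(pos)
    neg_err = sum(p != -1 for p in neg) / len(neg)
    return 0.5 * (pos_err + neg_err)


# Toy data (made up): 4 positive examples, 2 negative examples.
y_true = [1, 1, 1, 1, -1, -1]
y_pred = [1, 1, 1, -1, -1, 1]

ber = balanced_error_rate(y_true, y_pred)
print(ber)  # 0.5 * (1/4 + 1/2) = 0.375
```

On this toy example the plain error rate would be 2/6, while the BER is 0.375 because the errors on the smaller negative class weigh as much as those on the larger positive class.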

(93 MB, including the datasets of the feature selection challenge, CLOP, and the baseline models), or just download the datasets from the website of the feature selection challenge or from the NIPS2003 feature extraction workshop page, and the CLOP package from the website of the performance prediction challenge.

Your next challenge: outperform the results of the performance prediction challenge using CLOP!

Figure 2: The density of challenge entries as a function of BER.

  • Get more information

Teaching material is available from the website of the feature extraction class. The students were asked to make a poster summarizing their results. Here they are:
- Jiwen Li
- Theodor Mader
- Patrick Pletscher
- Georg Schneider
- Markus Uhr
These results were published in:

Competitive baseline methods set new standards for the NIPS 2003 feature selection benchmark, I. Guyon, J. Li, T. Mader, P. A. Pletscher, G. Schneider and M. Uhr, Pattern Recognition Letters, Vol. 28:12, Sept. 2007, Pages 1438-1444. [pdf][pdf of larger techreport].

Chih-Jen Lin has also used the same datasets in a class he taught.

We edited a book on feature extraction summarizing the results of the feature selection challenge and including tutorial chapters. There is a videotaped short course using this material.