ERRATUM

We apologize for an experimental methodology flaw in the paper "Gene selection for Cancer Classification". The problem mostly concerns Figure 4 (colon cancer), where several methods are compared using leave-one-out cross-validation (LOO). There, LOO was performed to assess the performance of the classifiers using a fixed set of features previously selected with the whole training data set. The result tables for both the leukemia and colon cancer data quote both these biased LOO results and the test set results.

The proper way to conduct LOO for feature selection is to avoid using a fixed set of features selected with the whole training data set, because the left-out example has then already influenced the selection, which biases the results. Instead, one should withhold a pattern, select features using the remaining patterns, and assess the performance of the classifier with the selected features on the left-out example. One then rotates over all the examples, recomputing the feature set and the classifier parameters each time. Note that this procedure assesses the performance of a classifier using a given number of features, not the predictive power of a given feature subset. The latter problem is better addressed using an independent test set.
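The procedure described above can be sketched as follows. This is only an illustration under stated assumptions, not the method of the paper: the ranking criterion (absolute difference of class means) and the nearest-centroid classifier are simple stand-ins chosen to keep the example dependency-free, and all function names are hypothetical. The flag select_inside_loop switches between the proper protocol and the flawed fixed-subset protocol.

```python
import random

def select_features(X, y, k):
    """Return indices of the k features with the largest absolute
    difference of class means (a stand-in ranking criterion)."""
    pos = [x for x, t in zip(X, y) if t == 1]
    neg = [x for x, t in zip(X, y) if t == 0]
    scores = []
    for j in range(len(X[0])):
        mean_pos = sum(x[j] for x in pos) / len(pos)
        mean_neg = sum(x[j] for x in neg) / len(neg)
        scores.append(abs(mean_pos - mean_neg))
    return sorted(range(len(scores)), key=lambda j: -scores[j])[:k]

def train_centroids(X, y, feats):
    """Nearest-centroid classifier (stand-in): one centroid per class."""
    cents = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        cents[label] = [sum(r[j] for r in rows) / len(rows) for j in feats]
    return cents

def predict(cents, x, feats):
    """Assign x to the class whose centroid is closest on the selected features."""
    def sq_dist(c):
        return sum((x[j] - cj) ** 2 for j, cj in zip(feats, c))
    return min(cents, key=lambda label: sq_dist(cents[label]))

def loo_error(X, y, k, select_inside_loop=True):
    """LOO error rate. With select_inside_loop=True the features are
    re-selected on each training fold (proper protocol); with False a
    single subset chosen on all the data is reused (biased protocol)."""
    fixed = None if select_inside_loop else select_features(X, y, k)
    errors = 0
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        feats = select_features(X_tr, y_tr, k) if select_inside_loop else fixed
        cents = train_centroids(X_tr, y_tr, feats)
        errors += predict(cents, X[i], feats) != y[i]
    return errors / len(X)

if __name__ == "__main__":
    # Pure-noise data: no feature carries any information about the labels.
    rng = random.Random(0)
    X = [[rng.gauss(0, 1) for _ in range(200)] for _ in range(20)]
    y = [0] * 10 + [1] * 10
    print("fixed subset  :", loo_error(X, y, 5, select_inside_loop=False))
    print("inside the loop:", loo_error(X, y, 5, select_inside_loop=True))
```

On pure-noise data such as this, the fixed-subset estimate tends to be optimistically low, whereas the estimate obtained with selection inside the loop should stay near chance level.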

Preprocessing the entire dataset before conducting the experiments may also introduce bias, because the preprocessing parameters then depend on the examples that are later left out. Therefore, it is advisable to include the preprocessing in the LOO loop as well.
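As an illustration of preprocessing inside the LOO loop, the sketch below assumes that the preprocessing is a per-feature standardization (zero mean, unit variance); the erratum does not say which preprocessing is meant, so this choice and the function names are hypothetical. The statistics are estimated on the training fold only and then applied unchanged to the left-out example.

```python
import math

def fit_standardizer(X):
    """Estimate per-feature mean and standard deviation on X only."""
    n, d = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(d)]
    stds = []
    for j in range(d):
        var = sum((row[j] - means[j]) ** 2 for row in X) / n
        stds.append(math.sqrt(var) or 1.0)  # constant feature: avoid division by zero
    return means, stds

def apply_standardizer(X, means, stds):
    """Standardize rows of X with previously estimated statistics."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

def loo_folds(X, y):
    """Yield (train patterns, train labels, test pattern, test label),
    with the standardization fitted on the training fold of each split."""
    for i in range(len(X)):
        X_tr, y_tr = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        means, stds = fit_standardizer(X_tr)  # the left-out example is never seen here
        X_tr_s = apply_standardizer(X_tr, means, stds)
        x_te_s = apply_standardizer([X[i]], means, stds)[0]
        yield X_tr_s, y_tr, x_te_s, y[i]
```

Feature selection and classifier training would then be run on each yielded training fold, and the resulting classifier evaluated on the corresponding standardized left-out pattern.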

Several papers have made similar mistakes, as pointed out recently in:
Selection bias in gene extraction on the basis of microarray gene-expression data, Christophe Ambroise and Geoffrey J. McLachlan, PNAS 2002 99: 6562-6566. http://www.pnas.org/cgi/reprint/99/10/6562.pdf
This paper provides valuable suggestions on how to conduct experiments properly. The original paper of Golub et al. that inspired our work had a proper experimental design, but we overlooked its methodology, which is made explicit in footnote 22. See Golub et al. (1999). Molecular classification of cancer: Class discovery and class prediction by gene expression monitoring, Science, 531-537. http://www.ib3.gmu.edu/gref/S02/csi739/golub_science1999.pdf

Other papers have benchmarked SVM RFE (and variants of the algorithm) against other methods, including:

S. Ramaswamy et al. Multiclass cancer diagnosis using tumor gene expression signatures. PNAS, vol.98, No26, pp. 15149-15154, December, 2001.
See also: http://www-genome.wi.mit.edu/mpr/publications/projects/Global_Cancer_Map/PNAS_Supplementary_Information.pdf

B. Schölkopf, I. Guyon, and J. Weston. Statistical learning and kernel methods in bioinformatics. In Proceedings NATO Advanced Studies Inst. on Artificial Intelligence and Heuristics Methods for Bioinformatics, San Miniato, Italy, October 1-11, 2001. http://www.clopinet.com/isabelle/Papers/kerbioinfo.pdf

J. Weston, A. Elisseeff, B. Schölkopf, and M. Tipping. Use of the zero norm with linear models and kernel methods. JMLR, 3(Mar):1439-1461, 2003.
http://www.jmlr.org/papers/volume3/weston03a/weston03a.pdf

A. Rakotomamonjy. Variable selection using SVM-based criteria. JMLR special Issue on Variable and Feature selection, JMLR, 3(Mar):1357-1370, 2003.
http://www.jmlr.org/papers/volume3/rakotomamonjy03a/rakotomamonjy03a.pdf

C. Furlanello, M. Serafini, S. Merler, and G. Jurman. Entropy-Based Gene Ranking without Selection Bias for the Predictive Classification of Microarray Data. BMC Bioinformatics, (4):54, 2003. http://www.biomedcentral.com/1471-2105/4/54
and
C. Furlanello, M. Serafini, S. Merler, and G. Jurman. An accelerated procedure for recursive feature ranking on microarray data. Neural Networks, 16(5-6):641-648, 2003. http://mpa.itc.it/papers/biblio03-bib.html#merler2003anaccelerated

Y. Li, C. Campbell, and M. Tipping. Bayesian automatic relevance determination algorithms for classifying gene expression data. Bioinformatics, 18 (10) pp. 1332-1339, 2002.  http://bioinformatics.cs.vt.edu/~easychair/LiCampbellTipping_Bioinformatics_2002.pdf

J. Zhu and T. Hastie. Classification of Gene Microarrays by Penalized Logistic Regression (to appear, Biostatistics). http://www-stat.stanford.edu/~hastie/Papers/plr.ps

P. M. Long and V. B. Vega. Boosting and microarray data. Machine Learning, 52(1):31-44, 2003. http://www1.cs.columbia.edu/~plong/publications/boosting_microarray.pdf

J. Zhu, S. Rosset, T. Hastie, and R. Tibshirani. 1-norm support vector machines. In Proceedings NIPS 2003. http://books.nips.cc/papers/files/nips16/NIPS2003_AA07.pdf

C. Gentile. Fast Feature Selection from Microarray Expression Data via Multiplicative Large Margin Algorithms. In Proceedings NIPS 2003. http://books.nips.cc/papers/files/nips16/NIPS2003_AA16.pdf

H. Fröhlich, O. Chapelle, and B. Schölkopf. Feature Selection for Support Vector Machines by Means of Genetic Algorithms. 15th IEEE International Conference on Tools with Artificial Intelligence, 2003. http://www-ra.informatik.uni-tuebingen.de/mitarb/froehlich/Publikationen/342_froehlich_hf.ps

K. Fujarewicz and M. Wiench. Selecting differentially expressed genes for colon tumor classification. Int. J. Appl. Math. Comput. Sci., Vol. 13, No. 3, 327–335, 2003. http://matwbn.icm.edu.pl/ksiazki/amc/amc13/amc1330.pdf

W. S. Noble. Support vector machine applications in computational biology.  In Kernel Methods in Computational Biology. B. Schoelkopf, K. Tsuda and J.-P. Vert, eds. MIT Press, 2004.  http://noble.gs.washington.edu/papers/noble_support.html

B. Krishnapuram, L. Carin, and A. Hartemink. Gene Expression Analysis: Joint Feature Selection and Classifier Design. In Kernel Methods in Computational Biology. B. Schölkopf, K. Tsuda, and J.-P. Vert, eds. MIT Press, 2004. http://www.cs.duke.edu/~amink/publications/papers/hartemink04.kernelbook.pdf
 
Guido Steiner; Laura Suter; Franziska Boess; Rodolfo Gasser; Maria Cristina de Vera; Silvio Albertini; Stefan Ruepp, Discriminating Different Classes of Toxicants by Transcript Profiling, 2004. http://www.medscape.com/viewarticle/489069_1  

Rong Xiao, Boosting Chain Learning for Object Detection, Paper ID _ 450, ICCV 2003.
http://research.microsoft.com/~t-rxiao/Publication/iccv2003.pdf

Lal, T. N., Hinterberger, T., Widman, G., Schröder, M., Hill, J., Rosenstiel, W., Elger, C. E., Schölkopf, B. and Birbaumer, N.: Methods Towards Invasive Human Brain Computer Interfaces.
NIPS 2004. http://www-ti.informatik.uni-tuebingen.de/%7Eschroedm/papers/NIPS2004.pdf

Asa Ben-Hur and Douglas Brutlag. Sequence motifs: highly predictive features of protein function. In Feature extraction, fundamentals and applications, I. Guyon et al., eds. Springer, to appear.

Textbooks teaching RFE:

Support Vector Machine In Chemistry
ISBN/SKU 9812389229
Author NIANYI CHEN
http://www.thattechnicalbookstore.com/b9812389229.htm

Analyzing Microarray Gene Expression Data (Wiley Series in Probability and Statistics)
ISBN 0471226165
Authors Geoffrey J. McLachlan, Kim-Anh Do, Christophe Ambroise
Publisher Wiley-Interscience
Published 2004/07
http://bookweb.kinokuniya.co.jp/guest/cgi-bin/booksea.cgi?ISBN=0471226165
 

Software packages implementing the algorithm are also available:

Gist - a C package by William Stafford Noble and Paul Pavlidis:
http://microarray.cpmc.columbia.edu/gist/gist-rfe.html

PyML - a Python Machine Learning package by Asa Ben-Hur
http://www.technion.ac.il/~asa/pyml/tutorial/node9.html

Spider - a Matlab package by Jason Weston, Andre Elisseeff, Goekhan Bakir, and Fabian Sinz:
http://www.kyb.tuebingen.mpg.de/bs/people/spider/help_rfe.html