| Data set name | Domain | Size (MB) | Data matrix type | Num. ex. (tr/val/te) | Num. feat. |
|---|---|---|---|---|---|
| ADA | Marketing | 0.6 | Non sparse | 4147 / 415 / 41471 | 48 |
| GINA | Digit recognition | 19.4 | Non sparse | 3153 / 315 / 31532 | 970 |
| HIVA | Drug discovery | 7.6 | Non sparse | 3845 / 384 / 38449 | 1617 |
| NOVA | Text classification | 2.3 | Sparse binary | 1754 / 175 / 17537 | 16969 |
| SYLVA | Ecology | 15.6 | Non sparse | 13086 / 1308 / 130858 | 216 |
| Entrant | Method | BER Guess | Test BER | Test AUC | Test Sigma | Test Guess Error | Test Score | Average Rank |
|---|---|---|---|---|---|---|---|---|
| Roman Lutz | | 0.10398 | 0.108984 | 0.891016 | 0.002416 | 0.007911 | 0.116482 | 6.2 |
| Gavin Cawley | | 0.110463 | 0.11244 | 0.924643 | 0.002462 | 0.003366 | 0.115187 | 7.6 |
| Radford Neal | | 0.12008 | 0.111803 | 0.930368 | 0.00246 | 0.012167 | 0.123495 | 7.8 |
| Corinne Dahinden | | 0.1087 | 0.115813 | 0.884187 | 0.00248 | 0.010635 | 0.126363 | 7.8 |
| Wei Chu | | 0.106902 | 0.115307 | 0.536672 | 0.002504 | 0.008405 | 0.123346 | 8.2 |
| Nicolai Meinshausen | | 0.107605 | 0.116665 | 0.883335 | 0.002491 | 0.010973 | 0.127356 | 8.6 |
| Marc Boulle | | 0.1306 | 0.130675 | 0.92424 | 0.00267 | 0.009634 | 0.139864 | 10.4 |
| Kari Torkkola & Eugene Tuv | | 0.09904 | 0.119135 | 0.880865 | 0.002524 | 0.020761 | 0.139833 | 14 |
| Olivier Chapelle | | 0.113162 | 0.126242 | 0.920822 | 0.00267 | 0.015373 | 0.141382 | 15.8 |
| J. Wichard | | 0.12772 | 0.125181 | 0.901349 | 0.002681 | 0.017939 | 0.142989 | 16.6 |
| Advanced Analytical Methods, INTEL | | 0.1123 | 0.136637 | 0.90704 | 0.002654 | 0.027929 | 0.164384 | 16.6 |
| Vladimir Nikulin | | 0.1044 | 0.133446 | 0.866661 | 0.002639 | 0.029046 | 0.16239 | 16.8 |
| Edward Harrington | | 0.129347 | 0.139374 | 0.899119 | 0.002622 | 0.010027 | 0.149237 | 16.8 |
| Kai | Chi | 0.1129 | 0.131494 | 0.868506 | 0.002733 | 0.024463 | 0.155948 | 17.6 |
| Yu-Yen Ou | | 0.0966 | 0.127346 | 0.872654 | 0.002656 | 0.033639 | 0.16098 | 17.8 |
| Juha Reunanen | | 0.142651 | 0.146752 | 0.913843 | 0.002954 | 0.008484 | 0.154454 | 19 |
| Yen-Jen Oyang | RBF + ICA ( 3: 1 ) 86 + v | 0.1125 | 0.132369 | 0.867631 | 0.002777 | 0.021092 | 0.153333 | 19 |
| Patrick Haluptzok | NN Vanilla | 0.1396 | 0.139603 | 0.860397 | 0.00276 | 0.024002 | 0.163415 | 19.2 |
| Tobias Glasmachers | KTA+CV+SVM (3) | 0.135685 | 0.148765 | 0.890652 | 0.002722 | 0.015699 | 0.164224 | 19.6 |
| Darby Tien-Hao Chang | PCA+ME+SVM+valid | 0.1425 | 0.150324 | 0.849676 | 0.002537 | 0.018775 | 0.168944 | 21.4 |
| gavin growup | chi+ica+com | 0.1061 | 0.134534 | 0.865466 | 0.002782 | 0.028434 | 0.162963 | 21.6 |
| WHY | chi + svm | 0.1029 | 0.138013 | 0.861987 | 0.002651 | 0.040983 | 0.178989 | 22 |
| Seyna | Fscore+Chi+SVM | 0.10198 | 0.134534 | 0.865466 | 0.002782 | 0.032554 | 0.167083 | 22 |
| Chunghoon Kim | 2D-CLDA-Quad (2) | 0.1481 | 0.153428 | 0.885056 | 0.002874 | 0.022441 | 0.175736 | 22.6 |
| Machete | 16-LSVC+adaboost. | 0.120622 | 0.170669 | 0.881446 | 0.002887 | 0.050047 | 0.220686 | 25.2 |
| decoste | submit_test | 0.137154 | 0.175312 | 0.901088 | 0.002868 | 0.038639 | 0.213911 | 25.6 |
| Myoung Soo Park | Scaling + CA-PCA + ELN | 0.1338 | 0.199105 | 0.800895 | 0.00291 | 0.065305 | 0.26441 | 26.4 |
| w_pietrus | CWFS + DT | 0.25776 | 0.193905 | 0.806095 | 0.002728 | 0.279849 | 0.473628 | 27 |
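In many rows of the table above, the two columns that follow BER Guess sum to one (for example 0.108984 + 0.891016 for Roman Lutz). That is what one would expect if those columns hold the test BER and the test AUC and the AUC was computed from hard class labels rather than real-valued scores, although the source does not say so. The following minimal Python sketch illustrates the identity AUC = 1 - BER for hard predictions. The labels in {-1, +1}, the toy data, and the function names are assumptions made for illustration, not challenge code.

```python
def balanced_error_rate(y_true, y_pred):
    """Average of the error rates on the positive and the negative class."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == -1]
    fn_rate = sum(1 for p in pos if p != 1) / len(pos)
    fp_rate = sum(1 for p in neg if p != -1) / len(neg)
    return 0.5 * (fn_rate + fp_rate)


def auc(y_true, scores):
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == -1]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy labels and hard predictions (not challenge data).
y_true = [1, 1, 1, -1, -1, -1, -1, -1]
y_pred = [1, 1, -1, -1, -1, -1, 1, -1]

ber = balanced_error_rate(y_true, y_pred)
auc_hard = auc(y_true, y_pred)  # hard labels used as scores
print(ber, auc_hard, ber + auc_hard)
```

With real-valued scores the AUC is no longer tied to the BER, which would be consistent with the rows whose two values do not sum to one.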
Ranking by dataset
Two of the per-dataset winners (Marc Boulle and Kari Torkkola & Eugene Tuv)
are not among the top three entrants in the overall ranking.
Table 3: Best entrants by dataset.
| Dataset | Entrant | Method | Test BER | Test AUC | Test Guess error | Test Score |
|---|---|---|---|---|---|---|
| ADA | Marc Boulle | | 0.172266 | 0.91491 | 0.007266 | 0.179288 |
| GINA | Kari Torkkola & Eugene Tuv | | 0.028833 | 0.971167 | 0.007266 | 0.030216 |
| HIVA | Gavin Cawley | | 0.275695 | 0.7671 | 0.001667 | 0.279689 |
| NOVA | Gavin Cawley | | 0.04449 | 0.991362 | 0.0065 | 0.044833 |
| SYLVA | Marc Boulle | | 0.00614 | 0.999119 | 0.000873 | 0.00618 |
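A winner on one dataset need not lead the overall ranking, because the overall table is ordered by Average Rank. The sketch below assumes that this column is the mean of an entrant's ranks on the individual datasets, with a lower test score being better. The source does not spell this out, so that reading, the tie handling, and the entrant names and scores are all hypothetical.

```python
def ranks(scores):
    """Rank entrants by score (rank 1 = lowest score); ties share the mean rank."""
    ordered = sorted(scores, key=scores.get)
    result = {}
    i = 0
    while i < len(ordered):
        j = i
        # Extend j over the run of entrants tied with ordered[i].
        while j + 1 < len(ordered) and scores[ordered[j + 1]] == scores[ordered[i]]:
            j += 1
        mean_rank = ((i + 1) + (j + 1)) / 2
        for name in ordered[i:j + 1]:
            result[name] = mean_rank
        i = j + 1
    return result


def average_rank(per_dataset_scores):
    """Mean of each entrant's per-dataset ranks."""
    collected = {}
    for scores in per_dataset_scores.values():
        for name, r in ranks(scores).items():
            collected.setdefault(name, []).append(r)
    return {name: sum(rs) / len(rs) for name, rs in collected.items()}


# Hypothetical test scores for three entrants on three datasets.
per_dataset_scores = {
    "D1": {"A": 0.10, "B": 0.12, "C": 0.15},
    "D2": {"A": 0.30, "B": 0.25, "C": 0.28},
    "D3": {"A": 0.05, "B": 0.05, "C": 0.04},
}

avg = average_rank(per_dataset_scores)
best_overall = min(avg, key=avg.get)
print(avg, best_overall)
```

In this toy example each entrant wins or ties near the top on some dataset, yet only one of them has the best average rank.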
Result analysis
Dataset profiles
Figure 1 shows the distribution of the test BER over the final entries for
each dataset. HIVA (drug discovery) appears to be the most difficult dataset:
both the average BER and the spread are high. ADA (marketing) is the second
hardest. Its distribution is strongly skewed with a heavy tail, indicating
that a small group of methods "solved" the problem in a way that was not
obvious to the other entrants. NOVA (text classification) and GINA (digit
recognition) come next. In both datasets the classes contain multiple
clusters, so the problems are highly non-linear, which may explain the very
long tails. Finally, SYLVA (ecology) is the easiest dataset, owing to the
large amount of training data.
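The notions of average, spread, and skew used in this paragraph can be made concrete with a few summary statistics. The following is a minimal sketch that uses only the Python standard library; the two lists of BER values are invented for illustration and are not the challenge results.

```python
import statistics


def profile(bers):
    """Return (mean, population standard deviation, skewness) of a list of BERs."""
    mean = statistics.fmean(bers)
    sd = statistics.pstdev(bers)
    # Third standardized moment: positive when the tail extends to high values.
    skew = sum((b - mean) ** 3 for b in bers) / (len(bers) * sd ** 3)
    return mean, sd, skew


# Hypothetical BER values: most entries cluster, a few are far worse.
heavy_tail = [0.17, 0.18, 0.18, 0.19, 0.19, 0.20, 0.20, 0.21, 0.30, 0.45]
# Hypothetical BER values spread evenly around their mean.
symmetric = [0.26, 0.27, 0.28, 0.29, 0.30, 0.31, 0.32, 0.33, 0.34]

for name, values in [("heavy_tail", heavy_tail), ("symmetric", symmetric)]:
    mean, sd, skew = profile(values)
    print(f"{name}: mean={mean:.3f} sd={sd:.3f} skew={skew:.2f}")
```

A clearly positive skewness corresponds to a long tail of poorly performing entries, while a value near zero indicates a roughly symmetric distribution.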
Many classifiers did well in this challenge, and the top-ranking entries use a variety of methods:
- Ensembles of decision trees (Roman Lutz, Corinne Dahinden, Intel)
- Kernel methods/SVMs (Gavin Cawley, Wei Chu, Kari Torkkola and Eugene
Tuv, Olivier Chapelle, Vladimir Nikulin, Yu-Yen Ou)
- Bayesian Neural Networks (Radford Neal)
- Ensembles of linear methods (Nicolai Meinshausen)
- Naive Bayes (Marc Boulle)
Other entrants used mixed models. Interestingly, single methods did better
than mixed models.
In Figure 3, we show the performances of the final entrants, with symbols
coding for the methods used:
X: Mixed or unknown method
TREE: Ensembles of trees (like Random Forests, RF)
NN/BNN: Neural networks or Bayesian neural networks
NB: Naive Bayes