ICML 2011 workshop on unsupervised
and transfer learning
Information Theoretic Model Validation by Approximate Optimization
Joachim M. Buhmann
Department of Computer Science, ETH Zurich
Model selection in pattern recognition requires (i) specifying a suitable
cost function for the data interpretation and (ii) controlling the degrees
of freedom depending on the noise level in the data. We advocate an
information-theoretic perspective in which the uncertainty in the
measurements quantizes the solution space of the underlying optimization
problem, thereby adaptively regularizing the cost function. A pattern
recognition model that can tolerate a higher level of fluctuations in the
measurements than alternative models is considered superior, provided that
its solution is equally informative. The optimal tradeoff between
"informativeness" and "robustness" is quantified by the approximation
capacity of the selected cost function.
Empirical evidence for this model selection concept is provided by
cluster validation in computer security, i.e., multilabel clustering of
Boolean data for role-based access control, as well as by high-dimensional
Gaussian mixture models and the analysis of microarray data. Furthermore,
the approximation capacity of the SVD cost function suggests an optimal
cutoff value for the SVD spectrum.
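The idea of a spectral cutoff can be illustrated with a minimal sketch:
the code below truncates an SVD at a given rank k. Note that this is only
the mechanics of applying a cutoff; the paper's information-theoretic
criterion for *choosing* k (the approximation capacity) is not reproduced
here, and k is treated as a given parameter.

```python
import numpy as np

def truncated_svd(X, k):
    """Rank-k approximation of X from its top-k singular triplets."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the k largest singular values and their vectors.
    return U[:, :k] * s[:k] @ Vt[:k, :]

rng = np.random.default_rng(0)
# Synthetic 20x15 matrix with rank at most 8.
X = rng.standard_normal((20, 8)) @ rng.standard_normal((8, 15))
X_hat = truncated_svd(X, 8)
print(np.allclose(X, X_hat))  # exact up to numerical precision when k >= rank(X)
```

In practice, the cutoff trades reconstruction accuracy against robustness to
measurement noise, which is exactly the tradeoff the approximation capacity
is meant to quantify.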