ICML 2011 Workshop on Unsupervised and Transfer Learning
Clustering: Science or Art?
Ulrike von Luxburg
Max Planck Institute for Intelligent Systems, Tuebingen, Germany
Robert C. Williamson
Australian National University, Australia
Isabelle Guyon
ClopiNet, Berkeley, California
We examine whether the quality of different clustering algorithms can
be compared by a general, scientifically sound procedure, which is
independent of particular clustering algorithms. We argue that the
major obstacle is the difficulty in evaluating a clustering algorithm
without taking into account the context: why does the user cluster his
data in the first place, and what does he want to do with the
clustering afterwards? We argue that clustering should not be treated
as an application-independent mathematical problem, but should always
be studied in the context of its end-use. Different techniques to
evaluate clustering algorithms have to be developed for different uses
of clustering. To simplify this procedure, we argue that it will be
useful to build a "taxonomy of clustering problems" to identify
clustering applications which can be treated in a unified way and that
such an effort will be more fruitful than attempting the
impossible: developing "optimal" domain-independent clustering algorithms
or even classifying clustering algorithms in terms of how they work.