ICML 2011 Workshop on Unsupervised and Transfer Learning

Transfer Learning with Cluster Ensembles

Ayan Acharya (1)
Eduardo R. Hruschka (1,2)
Joydeep Ghosh (1)
Sreangsu Acharyya (1)

(1) University of Texas (UT) at Austin, USA
(2) University of São Paulo (USP) at São Carlos, Brazil

Traditional supervised learning algorithms are usually unsuitable for transfer learning because they assume that the training and test/scoring data come from a common underlying distribution. This problem is exacerbated when the "test" data actually represent a related but different task and the test set contains no labeled examples that could reveal what has changed. We introduce a general optimization framework that takes as input one or more classifiers learned on the original task, together with the results of a cluster ensemble operating solely on the target task data, and yields a consensus labeling of the target data. This framework is general in that it admits a wide range of loss functions and classification/clustering methods. Empirical results on both text and hyperspectral data indicate that the proposed method can yield substantially better classification results than certain other transductive learning techniques or than naively applying the classifier (ensemble) learned on the original task to the target data.
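The abstract only sketches the framework at a high level, so the following is a rough illustration of the underlying idea rather than the authors' actual optimization: source-classifier posteriors on the target data are smoothed using a co-association matrix produced by a cluster ensemble over the same data, so that instances frequently clustered together receive similar labels. The function name consensus_labels, the simple iterative-averaging update, and the alpha/n_iter parameters are illustrative assumptions, not the method described in the paper.

```python
import numpy as np

def consensus_labels(class_probs, coassoc, alpha=0.5, n_iter=50):
    """Refine source-classifier posteriors with a target-side cluster ensemble (sketch).

    class_probs : (n, k) array of class probabilities for the n target instances,
                  produced by the classifier(s) trained on the original task.
    coassoc     : (n, n) co-association matrix from a cluster ensemble on the target
                  data (fraction of base clusterings placing instances i and j together).
    alpha       : weight on the clustering-based smoothing term (illustrative knob).
    """
    # Row-normalize the co-association matrix so it acts as a similarity-weighted average.
    W = coassoc / np.maximum(coassoc.sum(axis=1, keepdims=True), 1e-12)
    y = class_probs.copy()
    for _ in range(n_iter):
        # Blend each instance's classifier posterior with the average posterior of its
        # co-clustered neighbours, then renormalize each row to a probability distribution.
        y = (1 - alpha) * class_probs + alpha * (W @ y)
        y /= y.sum(axis=1, keepdims=True)
    return y.argmax(axis=1)
```

In this sketch the cluster ensemble never needs target labels; it only supplies pairwise similarity evidence, which is consistent with the abstract's point that the target task has no labeled examples.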