ICML 2011 Workshop on Unsupervised and Transfer Learning

Deep Learning of Representations for Unsupervised and Transfer Learning
Yoshua Bengio
Université de Montréal, Canada

Deep learning algorithms seek to exploit the unknown structure in the input distribution in order to discover good representations, often at multiple levels, with higher-level learned features defined in terms of lower-level features. The objective is to make these higher-level representations more abstract, with their individual features more invariant to most of the variations typically present in the training distribution, while preserving as much as possible of the information in the input. Ideally, we would like these representations to disentangle as much as possible the unknown factors of variation that underlie the training distribution. Such unsupervised learning of representations can be exploited usefully under the hypothesis that the input distribution P(x) is structurally related to some task of interest, say predicting P(y|x). The presentation focuses on why unsupervised pre-training of representations can be useful, and how it can be exploited in the transfer learning scenario, where we care about predictions on examples drawn from a distribution that differs from the training distribution.
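
To make the pre-training-then-transfer pipeline concrete, the following minimal Python/NumPy sketch greedily pre-trains two denoising-autoencoder layers on unlabeled inputs and exposes the resulting higher-level representation for a downstream supervised predictor of P(y|x). It is one common instantiation of such unsupervised pre-training, not the paper's exact method; all names, layer sizes, hyperparameters, and the synthetic data are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pretrain_dae(X, n_hidden, noise=0.3, lr=0.1, epochs=50):
    """Train one denoising-autoencoder layer with tied weights W.

    Minimizes squared reconstruction error of the clean input X
    from a corrupted copy, a purely unsupervised criterion."""
    n, d = X.shape
    W = rng.normal(0, 0.1, (d, n_hidden))
    b = np.zeros(n_hidden)   # hidden (encoder) bias
    c = np.zeros(d)          # reconstruction (decoder) bias
    for _ in range(epochs):
        Xc = X * (rng.random(X.shape) > noise)   # masking noise
        H = sigmoid(Xc @ W + b)                  # encode corrupted input
        R = sigmoid(H @ W.T + c)                 # decode with tied weights
        dR = (R - X) * R * (1 - R)               # grad at decoder pre-activation
        dH = (dR @ W) * H * (1 - H)              # grad at encoder pre-activation
        gW = Xc.T @ dH + dR.T @ H                # tied weights get both paths
        W -= lr * gW / n
        b -= lr * dH.sum(0) / n
        c -= lr * dR.sum(0) / n
    return W, b

# Greedy layer-wise stacking: each layer trains on the codes of the one below.
X = rng.random((500, 64))                 # stand-in for unlabeled inputs
W1, b1 = pretrain_dae(X, 32)
H1 = sigmoid(X @ W1 + b1)
W2, b2 = pretrain_dae(H1, 16)
features = sigmoid(H1 @ W2 + b2)          # higher-level representation

Here `features` would initialize or feed a supervised predictor of P(y|x), possibly trained on a related task whose examples are not drawn from exactly the same distribution; because no labels are used until that final stage, the learned representation can be reused across such related tasks.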