ICML 2011 Workshop on Unsupervised and Transfer Learning
Deep Learning of Representations for
Unsupervised and Transfer Learning
Yoshua Bengio
Université de Montréal, Canada
Deep learning
algorithms seek to exploit the unknown structure in the input
distribution in order to discover good representations, often at
multiple levels, with higher-level learned features defined in terms of
lower-level features. The objective is to make these higher-level
representations more abstract, with their individual features more
invariant to most of the variations that are typically present in the
training distribution, while preserving as much as possible of the
information in the input. Ideally, we would like these representations
to disentangle as much as possible the unknown factors of variation
that underlie the training distribution. Such unsupervised learning of
representations can be usefully exploited under the hypothesis that the
input distribution P(x) is structurally related to some task of
interest, say predicting P(y|x). The presentation focuses on why
unsupervised pre-training of representations can be useful, and how it
can be exploited in the transfer learning scenario, where we care
about predictions on examples that are not exactly of the same
distribution as the training distribution.
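To make the two-phase pipeline described above concrete, here is a minimal sketch of unsupervised pre-training followed by supervised fine-tuning. PyTorch, the synthetic data, the one-layer autoencoder, and all hyperparameters are illustrative assumptions for this sketch, not the setup used in the paper.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Plentiful unlabeled inputs (a stand-in for the input distribution P(x))
# and a scarce labeled set for the task of interest (a stand-in for P(y|x)).
X_unlabeled = torch.randn(512, 20)
X_labeled = torch.randn(64, 20)
y_labeled = torch.randint(0, 2, (64,))

# One-layer autoencoder: the encoder is the learned representation.
encoder = nn.Sequential(nn.Linear(20, 10), nn.Sigmoid())
decoder = nn.Linear(10, 20)

# Phase 1: unsupervised pre-training -- reconstruct x from its code,
# so that the representation captures structure in P(x).
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(decoder.parameters()), lr=1e-2)
for _ in range(200):
    opt.zero_grad()
    loss = F.mse_loss(decoder(encoder(X_unlabeled)), X_unlabeled)
    loss.backward()
    opt.step()

# Phase 2: supervised fine-tuning -- reuse the pre-trained encoder and
# train it jointly with a classifier head on the labeled task data.
head = nn.Linear(10, 2)
opt = torch.optim.Adam(
    list(encoder.parameters()) + list(head.parameters()), lr=1e-2)
for _ in range(100):
    opt.zero_grad()
    loss = F.cross_entropy(head(encoder(X_labeled)), y_labeled)
    loss.backward()
    opt.step()
```

In the transfer learning scenario, Phase 1 would run on data from the source distribution and Phase 2 on the target task, relying on the hypothesis stated above: a representation that captures the structure of P(x) remains useful for predicting P(y|x) even when the target examples are not drawn from exactly the training distribution.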