Yoshua Bengio bengioy@IRO.UMontreal.CA
Université de Montréal
One way to deal with high-dimensional data is to select a subset of the variables. Another is to look for informative combinations of these variables that explain the main variations in the data. Unsupervised dimensionality reduction has made much progress in recent years, particularly in providing globally optimizable criteria that yield non-linear transformations of the variables into a low-dimensional representation, e.g. Locally Linear Embedding, Isomap, Laplacian Eigenmaps, and kernel PCA. We review these methods and show that they all fit in a common framework: learning the eigenfunctions of a linear operator associated with a similarity function and with the underlying data distribution. This framework allows one to extend each method so that it provides a representation not only for the training data but also for new test points, without having to re-run the algorithm. We conclude by discussing the limitations of these methods, all of which rely heavily on having enough data to map out the local structure of the density.
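
To make the idea of embedding new test points without re-running the algorithm concrete, here is a minimal sketch of a Nyström-style extension. It is an illustration under stated assumptions, not the method as specified in this abstract: the Gaussian similarity function, the bandwidth `sigma`, the function names, and the random data are all hypothetical, and the kernel centering used by kernel PCA is omitted for brevity.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=1.0):
    # Pairwise Gaussian similarities between rows of A and rows of B.
    # The choice of kernel and sigma is an assumption made for this sketch.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def fit_embedding(X, n_components=2, sigma=1.0):
    # Eigendecompose the (uncentered) Gram matrix of the training data
    # and keep the leading eigenpairs.
    K = gaussian_kernel(X, X, sigma)
    eigvals, eigvecs = np.linalg.eigh(K)              # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # indices of the largest ones
    return eigvals[order], eigvecs[:, order]

def embed_new(X_train, eigvals, eigvecs, X_new, sigma=1.0):
    # Nystrom-style extension: coordinate k of a point x is
    #   (1 / lambda_k) * sum_i v_ik * K(x, x_i),
    # which only needs similarities between x and the training points.
    K_new = gaussian_kernel(X_new, X_train, sigma)
    return K_new @ eigvecs / eigvals

rng = np.random.default_rng(0)
X_train = rng.normal(size=(50, 3))   # hypothetical training data
X_test = rng.normal(size=(5, 3))     # hypothetical new points

eigvals, eigvecs = fit_embedding(X_train)
Y_test = embed_new(X_train, eigvals, eigvecs, X_test)

# Sanity check: applied to the training points themselves, the extension
# reproduces the eigenvector coordinates obtained during fitting.
Y_train_again = embed_new(X_train, eigvals, eigvecs, X_train)
```

In this sketch the extension agrees with the training embedding on the training points, because each retained eigenvector satisfies K v = lambda v. Multiplying coordinate k by the square root of lambda_k gives a kernel-PCA-style scaling, again without the centering step.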