ICML 2011 Workshop on Unsupervised and Transfer Learning

Autoencoders, Unsupervised Learning, and Deep Architectures
Pierre Baldi
UC Irvine, California, USA

Autoencoders play a fundamental role in unsupervised learning and in deep architectures for transfer learning and other tasks. Despite this, only linear autoencoders over the real numbers have been solved analytically. Here we present a general mathematical framework for the study of both linear and non-linear autoencoders. The framework allows one to derive an analytical treatment for the most non-linear autoencoder, the Boolean autoencoder, and to consider other classes of linear autoencoders over different fields. Learning in the Boolean autoencoder is equivalent to a clustering problem that can be solved in polynomial time when the number of clusters is small and becomes NP-complete when the number of clusters is large. The framework illuminates the connections between the different kinds of autoencoders, their learning complexity, their horizontal and vertical composability in deep architectures, and the fundamental connections between critical points and clustering, and it leads to a unified treatment of autoencoders, clustering, Hebbian learning, and information theory.
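As an illustration of the analytically solved case mentioned above, the following minimal sketch builds one optimal linear autoencoder over the real numbers from the top principal directions of the data. It relies on the standard result that the best rank-p linear reconstruction under squared error is the projection onto the top p principal components. The function names, the synthetic data, and the choice of NumPy are assumptions made for illustration and do not come from the talk.

```python
import numpy as np

def linear_autoencoder_optimum(X, p):
    """Return decoder A (n x p), encoder B (p x n), the data mean,
    and the singular values of the centered data (hypothetical helper)."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Rows of Vt are principal directions, sorted by decreasing singular value.
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    B = Vt[:p]      # encoder: maps n inputs to p hidden units
    A = Vt[:p].T    # decoder: maps p hidden units back to n outputs
    return A, B, mean, s

def reconstruction_error(X, A, B, mean):
    """Total squared reconstruction error of the autoencoder x -> A B x."""
    Xc = X - mean
    recon = Xc @ B.T @ A.T
    return float(np.sum((Xc - recon) ** 2))

# Synthetic data, assumed for illustration only: 200 samples in 5 dimensions.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

A, B, mean, s = linear_autoencoder_optimum(X, p=2)
err = reconstruction_error(X, A, B, mean)
# At this optimum the error equals the sum of the discarded squared singular values.
print(err, np.sum(s[2:] ** 2))
```

The solution is not unique: for any invertible p x p matrix C, the pair (A C, C^{-1} B) computes the same overall map and so attains the same error. The sketch simply picks the orthonormal representative.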
 
Pierre Baldi is Chancellor's Professor in the School of Information and Computer Sciences and the Department of Biological Chemistry, and the Director of the Institute for Genomics and Bioinformatics at the University of California, Irvine. Born and raised in Europe, he received his PhD from the California Institute of Technology in 1986. From 1986 to 1988 he was a postdoctoral fellow at the University of California, San Diego. From 1988 to 1995 he held positions as a faculty member and as a member of the technical staff at the California Institute of Technology and at the Jet Propulsion Laboratory. He was CEO of a startup company from 1995 to 1999 and joined UCI in 1999. His research work is at the intersection of the computational and life sciences, in particular the application of AI and statistical machine learning methods to problems in chemoinformatics, genomics, proteomics, and systems biology. Dr. Baldi has published over 250 peer-reviewed research articles and four books. He is the recipient of the 1993 Lew Allen Award, the 1999 Laurel Wilkening Faculty Innovation Award, a 2006 Microsoft Research Award, and the 2010 E. R. Caianiello Prize for research in machine learning. He is a Fellow of the American Association for the Advancement of Science (AAAS) and the Association for the Advancement of Artificial Intelligence (AAAI).