Thesis Defense: Towards a Theory of Hierarchical Learning

Speaker: Jake V. Bouvrie, Center for Biological & Computational Learning (CBCL)
Date: June 23, 2009
Time: 3:00PM to 4:00PM
Location: Bldg 46 Room 5165 (MIBR Reading Room)
Host: Tomaso Poggio, McGovern Institute, BCS Dept. and CSAIL
Contact: Kathleen D. Sullivan, 617-253-0551, kdsulliv@mit.edu
Relevant URL: http://cbcl.mit.edu/index.html
ABSTRACT:
In recent years, several hierarchical learning models have been developed and applied to a diverse range of practical tasks with much success. Little is known, however, about why such models work as well as they do. Indeed, most are difficult to analyze and cannot be easily characterized using the established tools of statistical learning theory.
In this talk we propose a mathematical framework describing a family of hierarchical learning architectures, and analyze both theoretically and empirically particular models expressed within our framework. The primary object of interest is a recursively defined feature map and its associated kernel. The class of models we consider exploits the fact that data in a wide variety of problems satisfy a decomposability property. Paralleling the primate visual cortex, hierarchies are assembled from alternating filtering and pooling stages that build progressively invariant representations which are simultaneously selective for increasingly complex stimuli. A goal of central importance in the study of hierarchical architectures and the cortex alike is to understand quantitatively the tradeoff between invariance and selectivity, and how the two together contribute to an improved representation useful for learning from data.
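As a rough illustration of what a recursively defined feature map and its kernel could look like, the sketch below builds a toy hierarchy over 1-D signals. It is not taken from the talk: the function names (normalize, patches, feature, kernel), the use of a normalized dot product as the filtering step, max pooling over contiguous windows, and the hand-picked templates are all assumptions made for the example.

```python
import numpy as np

def normalize(v):
    """Scale v to unit length; an all-zero vector is returned unchanged."""
    n = np.linalg.norm(v)
    return v / n if n > 0 else v

def patches(x, size):
    """All contiguous windows of `size` samples (the assumed decomposition of x)."""
    return [x[i:i + size] for i in range(len(x) - size + 1)]

def feature(x, level, sizes, templates):
    """Hypothetical recursive feature map.

    sizes[k]     : input length handled at level k (increasing with k)
    templates[k] : list of stored signals of length sizes[k], used by level k + 1
    """
    x = np.asarray(x, dtype=float)
    if len(x) != sizes[level]:
        raise ValueError("input length must equal sizes[level]")
    if level == 0:
        # Base case: the lowest layer is just the normalized raw signal.
        return normalize(x)
    lower = level - 1
    # Recursion: represent templates and sub-windows with the layer below.
    template_feats = [feature(t, lower, sizes, templates) for t in templates[lower]]
    patch_feats = [feature(p, lower, sizes, templates) for p in patches(x, sizes[lower])]
    # Filtering: similarity of every window to every template.
    responses = np.array([[np.dot(p, t) for p in patch_feats] for t in template_feats])
    # Pooling: keep the best match per template, discarding window position.
    pooled = responses.max(axis=1)
    return normalize(pooled)

def kernel(x, y, sizes, templates):
    """Kernel associated with the feature map: inner product of top-level features."""
    top = len(sizes) - 1
    fx = feature(x, top, sizes, templates)
    fy = feature(y, top, sizes, templates)
    return float(np.dot(fx, fy))
```

In this toy setting, with sizes = [2, 4] and templates = [[[1, 0], [0, 1], [1, 1]]], the signals [0, 1, 0, 0] and [0, 0, 1, 0] contain the same set of length-2 windows, so kernel gives them a similarity of 1 up to rounding (invariance to that shift), while [0, 1, 0, 0] and [1, 1, 0, 0] score strictly below 1 (selectivity).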
A reasonable expectation is that an unsupervised hierarchical representation will positively impact the sample complexity of a corresponding supervised learning task.
We conclude with a set of empirical results that evaluate the key assumptions built into the mathematical framework. In particular, we present simulations supporting the hypothesis that layered architectures can reduce the sample complexity of a non-trivial learning problem.
(Joint work with L. Rosasco, S. Smale, T. Ezzat and T. Poggio.)