3D generic object categorization, localization and pose estimation

Speaker: Silvio Savarese, University of Illinois, Urbana-Champaign (UIUC)
Date: April 23 2008
Time: 3:00PM to 4:00PM
Location: 32-D507
Host: C. Mario Christoudias, Gerald Dalley, MIT CSAIL
Contact: C. Mario Christoudias, Gerald Dalley, 3-4278, 3-6095, cmch@csail.mit.edu, dalleyg@mit.edu
Relevant URL:

NOTES
-----
* This talk is being presented in 32-D507.
* If you would like to meet with Silvio, please contact Maysoon Hamdiyyah, Gerald Dalley, and/or C. Mario Christoudias.
ABSTRACT
--------
Object categorization is one of the most important capabilities in visual recognition, and categorizing generic objects is very challenging. Individual objects vary in appearance and shape under photometric (e.g. illumination) and geometric (e.g. scale, viewpoint, occlusion) transformations. A key issue is finding a representation that can encode this tremendous visual variability. Largely because of this difficulty, most current research in object categorization has focused on modeling object classes in single views or within a small range of planar rotations. But our world is fundamentally 3D, and handling arbitrary changes of pose requires solving the problem of true 3D object categorization.
In this talk I introduce a novel model for representing and learning generic 3D object categories. We aim to design an algorithm that can recognize and classify images of object categories seen under arbitrary 3D poses. Our approach is to capture a compact model of an object category by linking together diagnostic parts of the objects seen from different viewpoints. Instead of recovering full 3D geometry, we connect these parts through their mutual homographic transformations. The resulting model is a compact summary of both the appearance and the geometry of the object class. Unlike earlier attempts at 3D object categorization, our framework requires only minimal supervision rather than careful manual alignment of corresponding viewpoints across all training images. Our categorization results show superior performance to state-of-the-art algorithms. Furthermore, we are the first to demonstrate 3D object classification, localization and pose estimation on the largest generic 3D object category dataset.
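As a rough illustration of the homographic link between parts mentioned above, the sketch below maps the corners of a planar part from one view into another using a 3x3 homography. It is not the speaker's method: the function `apply_homography`, the matrix `H_ab`, and the corner coordinates are all hypothetical values chosen only to show the mechanics.

```python
def apply_homography(H, points):
    """Map 2D points through a 3x3 homography H given as nested lists."""
    mapped = []
    for x, y in points:
        # Lift (x, y) to homogeneous coordinates (x, y, 1) and multiply by H.
        u = H[0][0] * x + H[0][1] * y + H[0][2]
        v = H[1][0] * x + H[1][1] * y + H[1][2]
        w = H[2][0] * x + H[2][1] * y + H[2][2]
        if w == 0:
            raise ValueError("point maps to infinity under this homography")
        # Divide by w to return to ordinary image coordinates.
        mapped.append((u / w, v / w))
    return mapped


# Hypothetical homography relating a part in view A to the same part in view B.
# The nonzero entry in the last row adds a perspective effect.
H_ab = [
    [2.0, 0.0, 1.0],
    [0.0, 2.0, 3.0],
    [0.5, 0.0, 1.0],
]

# Hypothetical corners of a unit-square part as seen in view A.
part_corners_a = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
part_corners_b = apply_homography(H_ab, part_corners_a)

for a, b in zip(part_corners_a, part_corners_b):
    print(a, "->", b)
```

Under this assumption a planar part needs only eight free parameters to relate two of its views, which is one way to read the abstract's point that no full 3D reconstruction is required.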
If time permits, I will discuss our current research on recognizing as well as synthesizing unseen views of generic object categories. We show that our model based on linked diagnostic parts is well suited to this highly challenging task.
BIO
---
Silvio Savarese earned his PhD in the Electrical Engineering department at the California Institute of Technology, completing his dissertation on "3D Object Reconstruction from Shadows and Reflections". He joined the University of Illinois at Urbana-Champaign in 2005 as a Beckman Institute Fellow. In 2002 he was a recipient of the Walker von Brimer Award for outstanding research initiative. He received the “Laurea” degree, magna cum laude, from the University of Naples, Italy, in 1999. His research interests include computer vision, computer graphics, object and scene recognition, shape representation and reconstruction, human visual perception and visual psychophysics.
See other events that are part of MIT Machine Vision Colloquium 2007/2008