Learning to See the Physical World

Speaker

MIT CSAIL

Host

Bill Freeman, Josh Tenenbaum

MIT

Human intelligence goes beyond pattern recognition. From a single image, we can explain what we see, reconstruct the scene in 3D, predict what's going to happen, and plan our actions accordingly. In this talk, I will present our recent work on physical scene understanding: building versatile, data-efficient, and generalizable machines that learn to see, reason about, and interact with the physical world. The core idea is to exploit the generic, causal structure behind the world, including knowledge from computer graphics, physics, and language, in the form of approximate simulation engines, and to integrate them with deep learning. Here, deep learning plays two major roles: first, it learns to invert simulation engines for efficient inference; second, it learns to augment simulation engines for constructing powerful forward models. Built upon this idea, our system may efficiently construct scene representations for both object geometry and physics, and use them for planning and control.

Date

September 09 2019, 13:00 to 14:00 (America/New_York)

Location

32-G449, Kiva

Organizer & Contact

Jiajun Wu

jiajunwu@csail.mit.edu
