A Variational Approach to Object-Centric Image Models and World Models

Speaker

Technion

Host

Pulkit Agrawal

Abstract:
Unsupervised latent variable models are effective tools for representing complex data such as images or world models, with applications including robotic manipulation, video generation, and novelty detection. Variational Autoencoders (VAEs) provide compact latent representations with stability and efficiency. In this talk, we will explore modern VAEs that mitigate shortcomings of classical approaches, such as blurry images, and that can serve as a basis for strong world models. The first paper (CVPR 2021 Oral) introduces "Soft-IntroVAE", a refined approach to introspective variational autoencoders that improves training stability, offers theoretical insights, and showcases applications. The second paper (ICML 2022) presents "Deep Latent Particles (DLP)" for unsupervised image representation learning, offering disentangled object features, uncertainty estimation, and versatile applications. Building on DLP, the third paper unveils "DDLP", an object-centric algorithm for video prediction, manipulation, and generation that is efficient and interpretable and achieves state-of-the-art results.

About Tal:
Tal (https://taldatech.github.io) is a third-year Ph.D. student under the supervision of Prof. Aviv Tamar in the Electrical and Computer Engineering faculty at the Technion, where he also earned his B.Sc. and M.Sc. His research interests include unsupervised representation learning, generative modeling, and reinforcement learning.

Date

August 25, 2023, 1:30 PM to 2:30 PM (America/New_York)

Location

32-370

Organizer & Contact

Idan Shenfeld

idanshen@csail.mit.edu
