We aim to create a virtual environment in which agents learn to perform human tasks by executing programs. We further aim to develop models that generate such programs from video or text, enabling agents to understand and imitate these activities.

In this project, we are interested in modeling complex activities that occur in a typical household. We propose to use programs, i.e., sequences of atomic actions and interactions, as a high-level representation of complex tasks. Programs are interesting because they provide an unambiguous representation of a task and allow agents to execute them. However, no existing dataset provides this type of information. Towards this goal, we crowd-source programs for a variety of activities that happen in people’s homes, via a game-like interface used for teaching kids how to code. Using the collected dataset, we show how to extract programs directly from natural language descriptions or from videos. We then implement the most common atomic (inter)actions in the Unity3D game engine, and use our programs to “drive” an artificial agent to execute tasks in a simulated household environment. Our VirtualHome simulator allows us to create a large activity video dataset with rich ground truth, enabling training and testing of video understanding models.
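To make the representation concrete, here is a minimal sketch of what a program-as-sequence-of-atomic-steps could look like in Python. The Step and Program classes, the action names, and the “watch tv” example are illustrative assumptions for this sketch, not the project’s actual data format.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class Step:
    """One atomic (inter)action, e.g. walking to an object or grabbing it."""
    action: str                                        # atomic action name, e.g. "walk", "grab"
    objects: List[str] = field(default_factory=list)   # objects the action involves

@dataclass
class Program:
    """A task represented as an ordered sequence of atomic steps."""
    task: str
    steps: List[Step]

# Hypothetical program for the task "watch tv" (example actions and
# object names are assumptions, not the dataset's real vocabulary).
watch_tv = Program(
    task="watch tv",
    steps=[
        Step("walk", ["living_room"]),
        Step("walk", ["remote_control"]),
        Step("grab", ["remote_control"]),
        Step("switch_on", ["tv"]),
        Step("sit", ["sofa"]),
    ],
)

# An executor can dispatch steps one at a time.
for step in watch_tv.steps:
    print(step.action, *step.objects)
```

Because each step names exactly one atomic action and its arguments, a simulated agent can execute a task step by step without having to resolve the ambiguity inherent in natural language descriptions.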

Members

Sanja Fidler

Kevin Ra

Marko Boben

Tingwu Wang

Jiaman Li

Shantanu Jain