THESIS DEFENSE: Manel Baradad, "Learning to See with Synthetic Procedural Images"

Speaker

Manel Baradad
CSAIL

Host

Antonio Torralba
CSAIL

Abstract:
This thesis explores a novel approach to training vision systems using synthetic procedural images generated from code, rather than relying on traditional natural image datasets. We investigate a wide range of procedural image generation techniques, from simple statistical models to complex shader programs and visual concepts generated by large language models. Through extensive experiments, we demonstrate that neural networks trained solely on these procedural images can learn surprisingly effective visual representations that transfer well to real-world tasks. Our work analyzes the properties that make procedural datasets effective for training vision systems, shows how to scale up training to achieve strong performance across vision benchmarks, and explores applications in numerous tasks and domains. By reducing reliance on curated datasets, this approach opens up new possibilities for more efficient and ethical development of robust, general-purpose vision systems. These results suggest that procedural image generation is a promising new paradigm for training advanced computer vision models.

Thesis Committee: Antonio Torralba, Phillip Isola, Bill Freeman