Thesis Defense: Joanna Materzyńska, "Steering Vision at Scale: From the Model Weights to Training Data"
Abstract:
As generative vision models grow in scale and impact, so does the need for effective mechanisms to steer their behavior. In this talk, I present a unified framework for understanding controllability in generative models, highlighting interventions at multiple levels. I begin by exploring techniques for editing model weights to suppress or erase specific concepts, enabling safer and more intentional outputs. Next, I demonstrate how models can learn novel dynamic concepts and generalize from just a few examples. To address the limitations of discrete or unintuitive text-based prompts, I introduce methods for achieving continuous control over generation. Finally, I present the idea of opt-in models—models deliberately restricted to photographic domains—that, surprisingly, can still generalize to artistic styles from minimal exposure. Together, these approaches show that artistic expression and ethical constraints can coexist, even without reliance on large-scale scraped datasets.
Bio:
Joanna Materzyńska is a final-year PhD candidate in Computer Vision and Machine Learning at the Massachusetts Institute of Technology (MIT), advised by Prof. Antonio Torralba. She holds a Master’s degree from the University of Oxford, where she completed her thesis on “Disentangling Structured Knowledge from Images and Videos” under the supervision of Prof. Philip Torr. Joanna has worked on generative AI for content creation during research internships at Netflix and Adobe. Her current research explores the controllability of deep generative models and the copyright implications of generative art.
Thesis Committee:
Antonio Torralba
William Freeman
David Bau (Northeastern University)