Context is Environment
Speaker: Sharut Gupta, MIT CSAIL
Host: Behrooz Tahmasebi, MIT CSAIL
Abstract: Two lines of work are taking center stage in AI research. On the one hand, the community is making increasing efforts to build models that discard spurious correlations and generalize better in novel test environments. Unfortunately, the hard lesson so far is that no proposal convincingly outperforms a simple empirical risk minimization baseline. On the other hand, large language models (LLMs) have erupted as algorithms able to learn in-context, generalizing on the fly to eclectic contextual circumstances that users enforce by means of prompting. In this talk, we argue that context is environment, and posit that in-context learning holds the key to better domain generalization. Via extensive theory and experiments, we show that paying attention to context (unlabeled examples as they arrive) allows our proposed In-Context Risk Minimization (ICRM) algorithm to zoom in on the test-environment risk minimizer, leading to significant out-of-distribution performance improvements. From all of this, two messages are worth taking home. Researchers in domain generalization should consider environment as context, and harness the adaptive power of in-context learning. Researchers in LLMs should consider context as environment, to better structure data towards generalization.
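For a concrete picture of the idea before the talk, the sketch below shows one plausible reading of the ICRM objective described in the abstract: a causally masked sequence model is trained to predict each example's label while attending only to the unlabeled inputs that arrived earlier from the same environment. This is an illustrative assumption based on the abstract alone, not the authors' reference implementation; all names, shapes, and hyperparameters (ICRMSketch, d_model, and so on) are hypothetical.

```python
# Illustrative sketch of the ICRM idea from the abstract: a causally masked
# sequence model predicts each label while attending only to the unlabeled
# inputs that arrived earlier from the same environment. All names, shapes,
# and hyperparameters are assumptions, not the authors' implementation.
import torch
import torch.nn as nn

class ICRMSketch(nn.Module):
    def __init__(self, input_dim: int, num_classes: int, d_model: int = 64):
        super().__init__()
        self.embed = nn.Linear(input_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=4,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, input_dim); each sequence is drawn from a single
        # environment, so earlier inputs act as that environment's context.
        seq_len = x.size(1)
        # Causal mask: position t attends only to inputs 1..t, i.e. the
        # unlabeled examples seen so far.
        mask = torch.triu(torch.full((seq_len, seq_len), float("-inf")),
                          diagonal=1)
        h = self.encoder(self.embed(x), mask=mask)
        return self.head(h)  # per-position logits: (batch, seq_len, classes)

# One training step: ordinary cross-entropy at every sequence position,
# i.e. risk minimization over context-conditioned predictions.
model = ICRMSketch(input_dim=16, num_classes=3)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.randn(8, 10, 16)        # 8 environment-sequences, 10 examples each
y = torch.randint(0, 3, (8, 10))  # a label for each example
logits = model(x)
loss = nn.functional.cross_entropy(logits.reshape(-1, 3), y.reshape(-1))
opt.zero_grad()
loss.backward()
opt.step()
```

At test time, such a model would condition on whatever unlabeled examples have already arrived from the test environment, which is the "context is environment" adaptation the abstract describes.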
Speaker Bio: Sharut Gupta is a second-year Ph.D. student at MIT CSAIL, working with Prof. Stefanie Jegelka. Her research mainly focuses on building robust and generalizable machine learning systems under minimal supervision. She enjoys working on out-of-distribution generalization, self-supervised learning, causal inference, and representation learning.