Thesis Defense: Saachi Jain, Title: A data-centric perspective on model reliability
Speaker
Saachi Jain, MIT
Host
Committee Members: Aleksander Madry (Chair), Phillip Isola and Antonio Torralba
Abstract: Neural networks can fail to generalize reliably to real-world data, especially since deployment conditions often differ from the training environment. While many factors contribute to a model's behavior in such contexts, recent evidence indicates that the training dataset tends to play a pivotal role. The goal of this thesis is thus to build the foundation for a data-centric perspective on model reliability. It advances this objective through two main thrusts.
The first thrust centers on developing scalable techniques for identifying meaningful patterns of model failures. We further propose a data-based approach to mitigate such failures at their source, by isolating training examples that drive a targeted bias. The second thrust scrutinizes the role of pre-training data in the transfer learning setting. Specifically, we investigate the problem of "bias transfer", where biases from the pre-training data can cause reliability failures in downstream deployments. We then introduce a framework for pinpointing the impact of pre-training examples on downstream predictions, enabling us to identify (and remove) detrimental points from the pre-training dataset.