A New Paradigm for Learning with Distribution Shift

We revisit the fundamental problem of learning with distribution shift, where a learner is given labeled samples from training distribution D, unlabeled samples from test distribution D′ and is asked to output a classifier with low test error. The standard approach in this setting is to prove a generalization bound in terms of some notion of distance between D and D′. These distances, however, are difficult to compute, and this has been the main stumbling block for efficient algorithm design over the last two decades.
We sidestep this issue and define a new model called TDS learning, where a learner runs a test on the training set and is allowed to reject if this test detects distribution shift relative to a fixed output classifier. This approach leads to the first set of efficient algorithms for learning with distribution shift that do not take any assumptions on the test distribution. Finally, we discuss how our techniques have recently been used to solve longstanding problems for supervised learning with contamination.