Which Algorithms Have Tight Generalization Bounds?

Speaker

EPFL

Host

Justin Chen, Lily Chung, John Kuszmaul (CSAIL, EECS)

Generalization measures (GMs) are a central tool for providing performance guarantees and informing algorithmic design. Yet most such bounds are known to be loose (or even vacuous) in practice. In a series of works [1, 2], we focus on GMs in the overparametrized supervised learning setting, where we show that this looseness is unavoidable due to fundamental lower bounds. Notably, these lower bounds hold on average over finite collections of distributions, and with numerically appreciable values.
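
To fix ideas, here is a rough sketch of the standard formal setup, in our own notation (the precise definitions in [1, 2] may differ). A GM is a function g mapping a training sample S (and possibly the algorithm A) to an upper bound on the generalization gap, and it is tight if that bound is also nearly attained:

  \Pr_{S \sim D^n}\big[ L_D(A(S)) \le L_S(A(S)) + g(S, A) \big] \ge 1 - \delta   (validity)

  \Pr_{S \sim D^n}\big[ g(S, A) \le L_D(A(S)) - L_S(A(S)) + \varepsilon \big] \ge 1 - \delta   (tightness)

Here L_D and L_S denote the population and empirical losses and n is the sample size; the lower bounds below assert that, for broad families of (algorithm, distribution) pairs, no g can satisfy both conditions simultaneously.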

For GMs that are computed solely from the training sample and depend on neither the algorithm nor the distribution, we show non-tightness across a large fraction of (algorithm, distribution) combinations.
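
A classical example of such a sample-only GM (our illustration, not necessarily among the measures analyzed in [1, 2]) is the empirical Rademacher complexity bound: for a hypothesis class \mathcal{H} with induced loss class \ell \circ \mathcal{H} taking values in [0, 1], with probability at least 1 - \delta over S \sim D^n,

  L_D(h) \le L_S(h) + 2\,\widehat{\mathfrak{R}}_S(\ell \circ \mathcal{H}) + 3\sqrt{\ln(2/\delta) / (2n)} \quad \text{for all } h \in \mathcal{H}.

The right-hand side is computable from S alone and depends on neither the algorithm that selected h nor on D, placing this kind of bound squarely within the family covered by the lower bound above.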

For GMs that may additionally depend on the algorithm, we show that there can be a trade-off between an algorithm's ability to learn and the ability to verify its learning success with a GM.
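
In schematic form (our paraphrase in the notation above; see [2] for the precise statements), such a non-tightness result says that any valid g for A must overshoot the true gap by a numerically appreciable margin c on average over a finite collection of distributions \mathcal{P}:

  \frac{1}{|\mathcal{P}|} \sum_{D \in \mathcal{P}} \mathbb{E}_{S \sim D^n}\Big[ g(S, A) - \big( L_D(A(S)) - L_S(A(S)) \big) \Big] \ge c > 0.

The trade-off is, roughly, that improving A's performance as a learner on \mathcal{P} can force c upward, so learning well and certifying that learning with a GM can pull in opposite directions.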

Next, we study algorithm-dependent GMs for algorithms that admit a natural notion of algorithmic implicit bias. There, non-tightness of GMs provably occurs whenever the underlying distribution class is rich enough, as is the case, for example, when learning VC classes.
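
For orientation, a VC class is a hypothesis class \mathcal{H} of finite Vapnik-Chervonenkis dimension; for binary-valued \mathcal{H},

  \mathrm{VC}(\mathcal{H}) = \max\big\{ d : \exists\, x_1, \dots, x_d \text{ such that } \{ (h(x_1), \dots, h(x_d)) : h \in \mathcal{H} \} = \{0, 1\}^d \big\}.

Finite VC dimension characterizes PAC learnability, so learning a VC class is a canonical setting in which, per the result above, the distribution class is already rich enough to rule out tight GMs.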

Lastly, we show that a certain notion of algorithmic stability is sufficient for the existence of tight GMs.
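
The abstract does not spell out the stability notion; as a reference point (an assumption on our part, since the exact definition in [2] may differ), classical uniform stability in the sense of Bousquet and Elisseeff requires that replacing any single training example changes the loss at any test point by at most \beta:

  \sup_{S,\, i,\, z_i',\, z} \big| \ell(A(S), z) - \ell(A(S^{i \to z_i'}), z) \big| \le \beta,

where S^{i \to z_i'} denotes S with its i-th example replaced by z_i'. For such algorithms the expected generalization gap is at most \beta, which gives some intuition for why stability makes the gap predictable from the sample and hence amenable to tight GMs.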

Joint work with Ido Nachum (University of Haifa), Jonathan Shafer (MIT), and Michael Gastpar (EPFL).

[1]: ICLR 2024, see https://arxiv.org/abs/2309.13658
[2]: NeurIPS 2025 (Spotlight), see https://arxiv.org/abs/2410.01969