Does Learning Require Memorization? A Short Tale about a Long Tail

Speaker

Vitaly Feldman

Google Research

Host

Alexander Rakhlin

Brain and Cognitive Sciences Department - MIT

Abstract:
Learning algorithms based on deep neural networks are well known to fit the training set (nearly) perfectly and to fit even random labels well. The reasons for this tendency to memorize the labels of the training data are not well understood.

We provide a simple model for prediction problems in which such memorization is necessary for achieving close-to-optimal generalization error. In our model, data is sampled from a mixture of subpopulations and the frequencies of these subpopulations are chosen from some prior. Our analysis demonstrates that memorization becomes necessary whenever the frequency prior is long-tailed. Image and text data are known to follow such distributions and therefore our results establish a formal link between these empirical phenomena. We complement the theoretical results with experiments on several standard benchmarks showing that memorization is an essential part of deep learning.
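
As a rough illustration of the setup described above (not the analysis from the talk or the paper), the sketch below draws a training set from a mixture of subpopulations and counts the subpopulations that appear exactly once. The Zipf-like prior, the function names `zipf_frequencies` and `singleton_stats`, and all parameter values are assumptions chosen for illustration only.

```python
import random
from collections import Counter


def zipf_frequencies(num_subpops, exponent=1.0):
    """Normalized Zipf-like frequencies: rank k gets weight proportional to 1 / k**exponent.

    This is an assumed stand-in for a long-tailed frequency prior.
    """
    weights = [1.0 / (k ** exponent) for k in range(1, num_subpops + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def singleton_stats(frequencies, num_samples, seed=0):
    """Sample subpopulation indices and summarize those observed exactly once."""
    rng = random.Random(seed)
    draws = rng.choices(range(len(frequencies)), weights=frequencies, k=num_samples)
    counts = Counter(draws)
    singletons = [s for s, c in counts.items() if c == 1]
    return {
        "distinct_seen": len(counts),
        "singletons": len(singletons),
        # Total population frequency of the subpopulations seen exactly once.
        "singleton_mass": sum(frequencies[s] for s in singletons),
    }


if __name__ == "__main__":
    long_tail = zipf_frequencies(num_subpops=5000, exponent=1.0)
    uniform = [1.0 / 50] * 50  # comparison case with no tail
    print("long-tailed:", singleton_stats(long_tail, num_samples=5000))
    print("uniform:    ", singleton_stats(uniform, num_samples=5000))
```

Under these assumed settings, the long-tailed mixture leaves many subpopulations represented by a single training example, while the uniform mixture leaves essentially none. A learner facing the long-tailed case has only that one example to go on for each such subpopulation, which is the intuition behind memorization being useful.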

Based on https://arxiv.org/abs/1906.05271 and ongoing work with Chiyuan Zhang.

Bio:
Vitaly Feldman is a research scientist at Google working on design and theoretical analysis of machine learning algorithms. His recent research interests include stability-based and information-theoretic tools for analysis of generalization, privacy-preserving learning, and adaptive data analysis. Vitaly holds a PhD from Harvard (2006) and was previously a research scientist at IBM Research - Almaden (2007-2017). He serves as a director on the steering committee of the Association for Computational Learning and was a program co-chair for COLT 2016.

Date and Time

October 03 2019, 16:00 - 17:00 (America/New_York)

Location

32-G449 (Stata Center - Patil/Kiva Conference Room)

Organizer & Contact

Marcia G. Davidson

marcia@csail.mit.edu

617-253-3049

Part of

Machine Learning Seminar Series 2019

Related Events

December 05

The Non-Stochastic Control Problem

October 29

David Spiegelhalter: Communicating uncertainty about facts, numbers and science