Nonvacuous Generalization Bounds for Deep Neural Networks via PAC-Bayes

Speaker

Daniel Roy
University of Toronto

Host

David Sontag
MIT CSAIL

Abstract:

A serious impediment to a rigorous understanding of the generalization performance of algorithms like SGD for neural networks is that most generalization bounds are numerically vacuous when applied to modern networks on real data sets. In recent work (Dziugaite and Roy, UAI 2017), we argue that it is time to revisit the problem of computing nonvacuous bounds, and show how the empirical phenomenon of "flat minima" can be operationalized using PAC-Bayesian bounds, yielding the first nonvacuous bounds for a large (stochastic) neural network on MNIST. The bound is obtained by first running SGD and then optimizing the distribution of a random perturbation of the weights so as to capture the flatness and minimize the PAC-Bayes bound. I will describe this work, its antecedents, its goals, and subsequent work, focusing on where others have and have not made progress towards understanding generalization according to our strict criteria.
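
For reference, the PAC-Bayesian bound underlying this approach can be stated roughly as follows (this is one standard form; constants differ slightly across versions, and the precise statement used is in arXiv:1703.11008). With probability at least 1 − δ over an i.i.d. training sample S of size m, simultaneously for every distribution Q over network weights,

\[
\mathrm{kl}\!\left(\hat{e}_Q(S) \,\middle\|\, e_Q\right) \;\le\; \frac{\mathrm{KL}(Q \,\|\, P) + \ln\frac{m}{\delta}}{m - 1},
\]

where \(\hat{e}_Q(S)\) is the empirical error of the stochastic classifier defined by Q, \(e_Q\) is its expected error under the data distribution, P is a prior fixed before seeing S, \(\mathrm{kl}\) is the KL divergence between Bernoulli distributions, and \(\mathrm{KL}(Q\|P)\) is the KL divergence between posterior and prior. Roughly speaking, the paper takes Q to be a Gaussian centered near the SGD solution with a learnable diagonal covariance (capturing flatness) and P to be a Gaussian centered at the random initialization, and then minimizes an upper bound on the right-hand side by stochastic gradient methods.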

Joint work with Gintare Karolina Dziugaite, based on https://arxiv.org/abs/1703.11008, https://arxiv.org/abs/1712.09376, and https://arxiv.org/abs/1802.09583

Bio:

Daniel Roy is an Assistant Professor in the Department of Statistical Sciences and, by courtesy, Computer Science at the University of Toronto, and a founding faculty member of the Vector Institute for Artificial Intelligence. Daniel is a recent recipient of an Ontario Early Researcher Award and a Google Faculty Research Award. Before joining U of T, Daniel held a Newton International Fellowship from the Royal Academy of Engineering and a Research Fellowship at Emmanuel College, University of Cambridge. Daniel earned his S.B., M.Eng., and Ph.D. from the Massachusetts Institute of Technology; his dissertation on probabilistic programming won an MIT EECS Sprowls Dissertation Award. Daniel's group works on the foundations of machine learning and statistics.