The Lottery Ticket Hypothesis: On Sparse, Trainable Neural Networks

Speaker

Jonathan Frankle

MIT CSAIL

Host

Michael Carbin

MIT CSAIL

In this thesis defense, I will present my work on the "Lottery Ticket Hypothesis," which offers a new perspective on how neural networks learn in practice and how we can make this process more efficient. We have known for decades that it is possible to delete up to 90% of the connections from a trained neural network (a process known as pruning) without any effect on accuracy. In my thesis work, I showed that it is also possible to train such pruned networks from the start of training or from a point near it, something the previous consensus deemed impossible. The takeaway is that neural networks can learn successfully with far less capacity than we typically provide. This has significant practical and scientific implications. Practically, it reveals a new opportunity to dramatically reduce the cost of training the extraordinary models that are increasingly out of reach for all but the best-resourced companies. Scientifically, it suggests, surprisingly, that the capacity a neural network needs to learn a function is similar to the capacity it needs to represent that function.

I will present the initial work on the Lottery Ticket Hypothesis (ICLR 2019 Best Paper Award), the follow-up work showing how to scale up these findings and providing insight into when and why sparse trainable networks exist ("Linear Mode Connectivity and the Lottery Ticket Hypothesis," ICML 2020), and the current state of efforts to exploit these findings in practice ("Pruning Neural Networks at Initialization: Why are we missing the mark?," ICLR 2021). I will close by discussing the implications of this work, including the numerous new research directions it has catalyzed - such as neural network pruning, efficient training, loss landscape analysis, model averaging for ensembling, and deep learning theory - and the evolution of this empirical approach to understanding and improving deep learning, which forms the basis for my startup MosaicML.
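The pruning-then-retraining idea described in the abstract can be illustrated with a minimal sketch. It assumes magnitude pruning (keeping the weights with the largest absolute values after training), which is one common pruning criterion; the abstract itself does not specify the criterion. The function names `magnitude_mask` and `apply_mask` and all weight values are hypothetical, chosen only for illustration.

```python
# Minimal sketch, assuming magnitude pruning: keep the largest-magnitude
# trained weights, then apply the same mask to the ORIGINAL initial weights
# so the sparse network can be trained again from the start.
# All names and numbers here are hypothetical, for illustration only.

def magnitude_mask(trained, keep_fraction):
    """Return a 0/1 mask that keeps the largest-magnitude fraction of weights."""
    k = max(1, round(len(trained) * keep_fraction))
    # Indices ordered from largest to smallest absolute trained value.
    order = sorted(range(len(trained)), key=lambda i: abs(trained[i]), reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(trained))]

def apply_mask(weights, mask):
    """Zero out every weight whose mask entry is 0."""
    return [w * m for w, m in zip(weights, mask)]

# Toy "layer" with ten connections: values at initialization and after training.
initial = [0.05, -0.40, 0.10, 0.30, -0.02, 0.20, -0.15, 0.01, 0.25, -0.08]
trained = [0.02, -0.90, 0.05, 0.10, -0.01, 0.30, -0.03, 0.00, 0.07, -0.04]

# Delete 90% of connections, i.e. keep 10%.
mask = magnitude_mask(trained, keep_fraction=0.1)

# The sparse starting point: surviving connections reset to their initial values.
sparse_init = apply_mask(initial, mask)
```

In this toy setup, `sparse_init` would be the starting point for retraining, with the mask held fixed so that pruned connections stay at zero.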

Date

December 09 2022, 13:30 to 15:00 (America/New_York)

Location

D463 (Star)

Organizer & Contact

Nathan Higgins

nhiggins@csail.mit.edu
