Failures of Gradient-Based Deep Learning

Speaker

Ohad Shamir
Weizmann Institute of Science

Host

Stefanie Jegelka
MIT CSAIL

In recent years, deep learning has become the go-to solution for a broad range of applications, with a long list of success stories. However, it is important, for both theoreticians and practitioners, to also understand the associated difficulties and limitations. In this talk, I'll describe several simple problems for which commonly used deep learning approaches either fail or suffer from significant difficulties, even if one is willing to make strong distributional assumptions. I'll illustrate these difficulties empirically, and provide theoretical insights explaining their source and (sometimes) how they can be remedied.

Includes joint work with Shai Shalev-Shwartz and Shaked Shammah.
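
As a concrete illustration of the kind of failure the abstract refers to, here is a minimal sketch (not taken from the talk itself) of one well-known hard case for gradient-based training: fitting a random parity function with a small network. The architecture, dimensions, and hyperparameters below are illustrative assumptions; the point is that the training loss typically stalls near chance level, since gradients with respect to a random parity target carry almost no usable signal.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

d = 30        # input dimension (illustrative choice)
n = 5000      # number of samples (illustrative choice)

# Target: parity of a hidden random subset of coordinates.
subset = torch.randperm(d)[: d // 2]
X = torch.randint(0, 2, (n, d)).float() * 2 - 1      # entries in {-1, +1}
y = X[:, subset].prod(dim=1, keepdim=True)           # parity label in {-1, +1}

# A small fully connected ReLU network trained by gradient descent (Adam).
model = nn.Sequential(nn.Linear(d, 128), nn.ReLU(), nn.Linear(128, 1))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for step in range(1501):
    opt.zero_grad()
    loss = loss_fn(model(X), y)
    loss.backward()
    opt.step()
    if step % 500 == 0:
        # The loss typically hovers around 1.0 (the variance of y),
        # i.e. chance level: the network fails to learn the parity.
        print(f"step {step:5d}  loss {loss.item():.3f}")
```

By contrast, replacing the parity target with a simple linear function of the inputs lets the same network fit quickly, which is what makes examples of this kind clean benchmarks for studying failures of gradient-based learning.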

Ohad Shamir is a faculty member in the Department of Computer Science and Applied Mathematics at the Weizmann Institute of Science, Israel. He received a PhD in computer science from the Hebrew University in 2010, advised by Prof. Naftali Tishby. From 2010 to 2013 he was a postdoctoral and associate researcher at Microsoft Research. His research focuses on machine learning, with an emphasis on algorithms that combine practical efficiency and theoretical insight. He is also interested in the many intersections of machine learning with related fields, such as optimization, statistics, theoretical computer science, and AI.