Systems-Aware Optimization for Machine Learning at Scale

Speaker

Virginia Smith
UC Berkeley

Host

Stefanie Jegelka

Abstract: New computing systems have emerged in response to the increasing size and complexity of modern datasets. For best performance, machine learning methods must be designed to closely align with the underlying properties of these systems. In this talk, we illustrate the impact of systems-aware machine learning in the distributed setting, where communication remains the most significant bottleneck. We propose CoCoA, a general optimization framework that uses local computation in a primal-dual setting to allow for a tunable, problem-specific communication scheme. The resulting framework enjoys strong convergence guarantees and state-of-the-art empirical performance. We demonstrate this with extensive experiments in Apache Spark, achieving speedups of up to 50x over leading distributed methods on common machine learning objectives.
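
To make the primal-dual structure concrete, below is a minimal, single-process sketch of one CoCoA-style outer round. It assumes an L2-regularized linear SVM (hinge loss) solved in the dual, with SDCA as the local solver and conservative 1/K averaging as the aggregation step; the helper names (local_sdca, cocoa_round) and all parameters are illustrative assumptions, not the reference implementation, which runs on Apache Spark.

```python
import numpy as np

def local_sdca(X, y, alpha, w, lam, n, local_iters, rng):
    """Approximately solve one partition's dual subproblem via SDCA
    steps for the hinge-loss SVM dual, where w = (1/(lam*n)) * X^T (alpha*y).
    Returns the local dual change and the implied primal change without
    touching the shared state. (Illustrative sketch, not CoCoA's API.)"""
    d_alpha = np.zeros_like(alpha)
    d_w = np.zeros_like(w)
    for _ in range(local_iters):
        i = rng.integers(len(y))                     # random local coordinate
        a_i = alpha[i] + d_alpha[i]
        resid = 1.0 - y[i] * (X[i] @ (w + d_w))      # hinge-loss residual
        step = lam * n * resid / (X[i] @ X[i])       # closed-form SDCA step
        delta = np.clip(a_i + step, 0.0, 1.0) - a_i  # keep alpha_i in [0, 1]
        d_alpha[i] += delta
        d_w += (delta * y[i] / (lam * n)) * X[i]
    return d_alpha, d_w

def cocoa_round(parts, alpha_blocks, w, lam, n, local_iters, rng):
    """One outer round: independent local work on each of K partitions,
    then a single aggregation averaging the K updates (the conservative
    1/K scheme). Only this reduce step would cross the network."""
    K = len(parts)
    d_w_total = np.zeros_like(w)
    for (X, y), alpha in zip(parts, alpha_blocks):
        d_alpha, d_w = local_sdca(X, y, alpha, w, lam, n, local_iters, rng)
        alpha += d_alpha / K                         # scaled dual update
        d_w_total += d_w / K                         # scaled primal update
    return w + d_w_total

# Usage on synthetic data: 4 simulated workers, 20 communication rounds.
rng = np.random.default_rng(0)
n, d, K, lam = 400, 20, 4, 0.01
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))
parts = [(X[k::K], y[k::K]) for k in range(K)]       # partition the data
alpha_blocks = [np.zeros(len(yk)) for _, yk in parts]
w = np.zeros(d)
for _ in range(20):
    w = cocoa_round(parts, alpha_blocks, w, lam, n, local_iters=100, rng=rng)
```

The design point the sketch illustrates: the inner loop touches only local data, so local_iters tunes the computation-communication trade-off per problem, while each outer round costs exactly one reduce over the K primal updates.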


Bio: Virginia Smith is a 5th-year Ph.D. student in the EECS Department at UC Berkeley, where she works jointly with Michael I. Jordan and David Culler as a member of the AMPLab. Her research interests are in large-scale machine learning and optimization, with a particular interest in applications that relate to energy and sustainability. She is actively working to improve diversity in computer science, most recently by co-founding the Women in Technology Leadership Round Table (WiT). Virginia has won several awards and fellowships while at Berkeley, including the NSF Graduate Research Fellowship, Google Anita Borg Memorial Scholarship, NDSEG Fellowship, MLConf Industry Impact Award, and Berkeley's Tong Leong Lim Pre-Doctoral Prize.