Systems-Aware Optimization for Machine Learning at Scale

Speaker

Virginia Smith
UC Berkeley

Host

Stefanie Jegelka

Abstract: New computing systems have emerged in response to the increasing size and complexity of modern datasets. For best performance, machine learning methods must be designed to closely align with the underlying properties of these systems. In this talk, we illustrate the impact of systems-aware machine learning in the distributed setting, where communication remains the most significant bottleneck. We propose CoCoA, a general optimization framework that uses local computation in a primal-dual setting to allow for a tunable, problem-specific communication scheme. The resulting framework enjoys strong convergence guarantees and state-of-the-art empirical performance. We demonstrate this with extensive experiments in Apache Spark, achieving speedups of up to 50x over leading distributed methods on common machine learning objectives.
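
To make the primal-dual structure concrete, below is a minimal, single-process sketch of one CoCoA-style outer round. It assumes an L2-regularized linear SVM (hinge loss) solved in the dual, with SDCA as the local solver and conservative 1/K averaging as the aggregation step; the helper names (local_sdca, cocoa_round) and all parameters are illustrative assumptions, not the reference implementation, which runs on Apache Spark.

```python
import numpy as np

def local_sdca(X, y, alpha, w, lam, n, local_iters, rng):
    """Approximately solve one partition's dual subproblem via SDCA
    steps for the hinge-loss SVM dual, where w = (1/(lam*n)) * X^T (alpha*y).
    Returns the local dual change and the implied primal change without
    touching the shared state. (Illustrative sketch, not CoCoA's API.)"""
    d_alpha = np.zeros_like(alpha)
    d_w = np.zeros_like(w)
    for _ in range(local_iters):
        i = rng.integers(len(y))                     # random local coordinate
        a_i = alpha[i] + d_alpha[i]
        resid = 1.0 - y[i] * (X[i] @ (w + d_w))      # hinge-loss residual
        step = lam * n * resid / (X[i] @ X[i])       # closed-form SDCA step
        delta = np.clip(a_i + step, 0.0, 1.0) - a_i  # keep alpha_i in [0, 1]
        d_alpha[i] += delta
        d_w += (delta * y[i] / (lam * n)) * X[i]
    return d_alpha, d_w

def cocoa_round(parts, alpha_blocks, w, lam, n, local_iters, rng):
    """One outer round: independent local work on each of K partitions,
    then a single aggregation averaging the K updates (the conservative
    1/K scheme). Only this reduce step would cross the network."""
    K = len(parts)
    d_w_total = np.zeros_like(w)
    for (X, y), alpha in zip(parts, alpha_blocks):
        d_alpha, d_w = local_sdca(X, y, alpha, w, lam, n, local_iters, rng)
        alpha += d_alpha / K                         # scaled dual update
        d_w_total += d_w / K                         # scaled primal update
    return w + d_w_total

# Usage on synthetic data: 4 simulated workers, 20 communication rounds.
rng = np.random.default_rng(0)
n, d, K, lam = 400, 20, 4, 0.01
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))
parts = [(X[k::K], y[k::K]) for k in range(K)]       # partition the data
alpha_blocks = [np.zeros(len(yk)) for _, yk in parts]
w = np.zeros(d)
for _ in range(20):
    w = cocoa_round(parts, alpha_blocks, w, lam, n, local_iters=100, rng=rng)
```

The design point the sketch illustrates: the inner loop touches only local data, so local_iters tunes the computation-communication trade-off per problem, while each outer round costs exactly one reduce over the K primal updates.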


Bio: Virginia Smith is a 5th-year Ph.D. student in the EECS Department at UC Berkeley, where she works jointly with Michael I. Jordan and David Culler as a member of the AMPLab. Her research interests are in large-scale machine learning and optimization, with a particular interest in applications that relate to energy and sustainability. She is actively working to improve diversity in computer science, most recently by co-founding the Women in Technology Leadership Round Table (WiT). Virginia has won several awards and fellowships while at Berkeley, including the NSF Graduate Research Fellowship, Google Anita Borg Memorial Scholarship, NDSEG Fellowship, MLConf Industry Impact Award, and Berkeley's Tong Leong Lim Pre-Doctoral Prize.