As machine learning rises in popularity, several interesting challenges come to light, many of which belong to an emerging area at the intersection of system design and machine learning. One of these challenges relates to scaling machine learning to ever larger datasets and ever more complicated models. Often, this means re-implementing known machine learning algorithms on new systems that are designed to scale. In some cases, such as training neural networks on graphics processing units (GPUs), such an approach works quite well. Unfortunately, in others, such as probabilistic modeling, this amounts to trying to fit a square peg in a round hole, and the results are less than satisfactory. In this talk, I will advocate for an approach to scalable machine learning where we co-design the machine learning algorithm and the supporting system. I will consider two examples.
First, I will show an example where we had to re-design a machine learning algorithm to make better use of a GPU. I will present a novel statistical inference algorithm that is both data-parallel and sparse and can be used to learn the latent Dirichlet allocation (LDA) topic model. Our algorithm is statistically as effective as other inference algorithms for LDA but fits the computational model of the GPU.
Second, I will show an example where we extended the capabilities of a system, an implementation of the Message Passing Interface (MPI), to support better scalability for an important class of models. I will present the concept of approximate counting, explain how it can help scale learning algorithms, and describe how we introduced it into MPI. Working at the intersection of system design and machine learning uncovered many interesting questions and problems about approximate counters, in particular about their addition and synchronization, which I will address.
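For readers unfamiliar with approximate counting, here is a minimal sketch of one classic scheme, the Morris counter, which stores only a small exponent instead of the full count. It is an illustration of the general idea and not necessarily the counter variant used in the work described above; the names `MorrisCounter` and `average_estimate` are hypothetical, and the sketch does not cover adding or synchronizing counters.

```python
import random


class MorrisCounter:
    """Approximate counter that keeps an exponent x and estimates n as 2**x - 1."""

    def __init__(self, rng=None):
        self.exponent = 0
        # A dedicated random generator makes runs reproducible when seeded.
        self.rng = rng if rng is not None else random.Random()

    def increment(self):
        # Raise the exponent with probability 2**(-exponent), so larger
        # counts are updated less and less often.
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # This estimate equals the true count in expectation.
        return 2 ** self.exponent - 1


def average_estimate(n, trials, seed=0):
    """Average the estimates of `trials` independent counters, each incremented n times."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        counter = MorrisCounter(rng)
        for _ in range(n):
            counter.increment()
        total += counter.estimate()
    return total / trials
```

A single counter is noisy, but it needs only about log2(log2(n)) bits of state, which is the property that makes this kind of counter attractive when many counts must be stored or communicated.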
Jean-Baptiste Tristan is a principal member of technical staff at Oracle Labs, where he leads a machine learning research group. Previously, he was a postdoctoral fellow at Harvard University. He obtained a Ph.D. in computer science from the French Institute for Research in Computer Science and Automation (INRIA) and studied computer science at the École Normale Supérieure in Paris. He is a senior member of the ACM and a recipient of the 2011 "La Recherche" award in information sciences. For the past five years, he has been active in research on statistical machine learning, especially scalable machine learning algorithms and systems.