Stefanie Jegelka

Associate Professor



As data changes in scope and value, the ability to understand different types of data is becoming a key factor for unlocking progress in science and technology. One way to make sense of data, whether in the form of images, networks, measurements, or subsets, is through machine learning. By developing new machine-learning methods for discrete and structured data that make learning more accurate and efficient, MIT researchers are making machine learning widely applicable, reliable, and robust.

Leading this research is Professor Stefanie Jegelka of the Machine Learning research group in MIT CSAIL. Prof. Jegelka, an X-Consortium Career Development Associate Professor at MIT EECS and a Principal Investigator at CSAIL, first became interested in machine learning when, as an undergraduate student, she had the opportunity to participate in research in computational neuroscience. Afterward, still as an undergraduate, she worked in a research lab at the Max Planck Institute on a project on disentangling mixed signals from different sources, also called Independent Component Analysis. This project involved large matrices and advanced optimization methods, and was the starting point for exploring many more machine-learning questions.

She quickly found that a particularly interesting challenge for machine-learning methods is that data points are often not simply columns of measurements in a table, but come in the form of images, molecules, subsets of preferred items, or members of a social network. To make machine learning applicable to such settings, we need new methods that account for the underlying structure and relationships. The question is how to do this accurately, robustly, and without needing too much time or data.

To get closer to answering these questions, Prof. Jegelka continued her research as a PhD student at the Max Planck Institutes in Tuebingen and ETH Zurich, and later as a postdoc at UC Berkeley. Since joining MIT CSAIL, she has taken a deeper dive into deep learning and is developing tools to make faster predictions and subset selections within a large dataset.

Her Machine Learning group focuses on developing fast and efficient machine-learning methods that offer theoretical guarantees. Specifically, this work includes:

  1. Mathematical modeling
  2. Algorithms for subset selection and combinatorial optimization problems in machine learning
  3. Learning with probability distributions over discrete objects
  4. Learning with graphs and networks
  5. Applications in computer vision and the development of new materials

The research in Prof. Jegelka's group shows when machine-learning methods will and will not work in these settings. By exploiting specific mathematical structure, the group also develops new machine-learning methods for these settings that ultimately enable the use of machine learning in a much wider range of scenarios.




Robust Optimization in Machine Learning and Data Mining

Many optimization problems in machine learning rely on noisy, estimated parameters. Neglecting this uncertainty can lead to large fluctuations in performance. We are developing algorithms that are robust to such errors for these problems, which are already nonconvex.



Community of Research

Vertical AI Community of Research

This CoR takes a unified approach to cover the full range of research areas required for success in artificial intelligence, including hardware, foundations, software systems, and applications.