As data grows in scope and value, the ability to understand different types of data is becoming a key factor in unlocking progress in science and technology. One way to make sense of data, whether in the form of images, networks, measurements, or subsets, is through machine learning. By developing new machine-learning methods for discrete and structured data that make learning more accurate and efficient, MIT researchers are making machine learning more widely applicable, reliable, and robust.
Leading this research is Professor Stefanie Jegelka of the Machine Learning research group at MIT CSAIL. Prof. Jegelka, an X-Consortium Career Development Associate Professor at MIT EECS and a Principal Investigator at CSAIL, first became interested in machine learning as an undergraduate student, when she had the opportunity to participate in research in computational neuroscience. Afterward, still as an undergraduate, she worked in a research lab at the Max Planck Institute on a project to disentangle mixed signals from different sources, a problem known as Independent Component Analysis. This project involved large matrices and advanced optimization methods, and it was the starting point for exploring many more machine-learning questions.
She quickly found that a particularly interesting challenge for machine-learning methods is that data points are often not simply columns of measurements in a table, but come in the form of images, molecules, subsets of preferred items, or members of a social network. To make machine learning applicable to such settings, we need new methods that account for the underlying structure and relationships. The question is how to do this accurately, robustly, and without needing too much time or data.
To get closer to answering these questions, Prof. Jegelka continued her research as a PhD student at the Max Planck Institutes in Tuebingen and at ETH Zurich, and later as a postdoc at UC Berkeley. Since joining MIT CSAIL, she has delved further into deep learning and is developing tools to make predictions and select subsets more quickly within large datasets.
Her Machine Learning group focuses on developing fast and efficient machine-learning methods that offer theoretical guarantees. Specifically, this work includes:
- Mathematical modeling
- Algorithms for subset selection and combinatorial optimization problems in machine learning
- Learning with probability distributions over discrete objects
- Learning with graphs and networks
- Applications in computer vision and the development of new materials
The research in Prof. Jegelka's group shows when machine-learning methods will and will not work in these settings. By exploiting specific mathematical structure, the group also develops new methods that ultimately enable the use of machine learning in a much wider range of scenarios.