As data grows in scope and value, the ability to understand different types of data is becoming a key factor in unlocking progress in science and technology. One way to make sense of data, whether in the form of images, networks, measurements, or subsets, is through machine learning. By developing new machine-learning methods for discrete and structured data that make learning more accurate and efficient, MIT researchers are making machine learning more widely applicable, reliable, and robust.
Leading this research is Professor Stefanie Jegelka of the Machine Learning research group in MIT CSAIL. Prof. Jegelka, an X-Consortium Career Development Associate Professor at MIT EECS and a Principal Investigator at CSAIL, first became interested in machine learning when, as an undergraduate student, she had the opportunity to participate in research in computational neuroscience. Afterward, still as an undergraduate, she worked in a research lab at the Max Planck Institute on a project on disentangling mixed signals from different sources, also called Independent Component Analysis. This project involved large matrices and advanced optimization methods, and was the starting point for exploring many more machine-learning questions.
She quickly found that a particularly interesting challenge for machine-learning methods is that data points are often not simply columns of measurements in a table, but come in the form of images, molecules, subsets of preferred items, or members of a social network. To make machine learning applicable to such settings, we need new methods that account for the underlying structure and relationships. The question is how to do this accurately, robustly, and without needing too much time or data.
To get closer to answering these questions, Prof. Jegelka continued her research as a PhD student at the Max Planck Institutes in Tuebingen and ETH Zurich, and later as a postdoc at UC Berkeley. Since joining MIT CSAIL, she has taken a deeper dive into deep learning and is developing tools for making faster predictions and selecting subsets from large datasets.
Her Machine Learning group focuses on developing fast and efficient machine-learning methods that offer theoretical guarantees. Specifically, this work includes:
Algorithms for subset selection and combinatorial optimization problems in machine learning
Learning with probability distributions over discrete objects
Learning with graphs and networks
Applications in computer vision and the development of new materials
The research in Prof. Jegelka's group shows when machine-learning methods will and will not work in these settings. By exploiting specific mathematical structure, the group also develops new methods for these settings, which ultimately enables the use of machine learning in a much wider range of scenarios.
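As one illustration of how mathematical structure can make subset selection tractable, the sketch below runs the classic greedy heuristic on a coverage objective, a standard textbook example of a submodular function. It is not taken from the group's work; the function name `greedy_max_coverage` and the toy `sensors` data are hypothetical and serve only to show the idea.

```python
def greedy_max_coverage(candidates, k):
    """Greedily pick up to k candidate sets that together cover many elements.

    candidates: dict mapping a candidate's name to the set of elements it covers.
    Returns the chosen names (in pick order) and the union of covered elements.
    """
    chosen, covered = [], set()
    for _ in range(k):
        # Pick the unchosen candidate with the largest marginal gain.
        best = max(
            (name for name in candidates if name not in chosen),
            key=lambda name: len(candidates[name] - covered),
            default=None,
        )
        if best is None or not (candidates[best] - covered):
            break  # no remaining candidate adds anything new
        chosen.append(best)
        covered |= candidates[best]
    return chosen, covered


# Hypothetical toy data: each "sensor" covers some numbered locations.
sensors = {
    "a": {1, 2, 3},
    "b": {3, 4},
    "c": {4, 5, 6, 7},
    "d": {1, 7},
}
chosen, covered = greedy_max_coverage(sensors, k=2)
print(chosen, sorted(covered))
```

For coverage objectives, the diminishing-returns property is what gives this simple procedure a provable approximation guarantee, which is the kind of structure-based guarantee the paragraph above refers to.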
Graph Neural Networks (GNNs) are a powerful framework revolutionizing graph representation learning, but our understanding of their representational properties is limited. This project aims to explore the theoretical foundations of learning with graphs and relations in AI via the GNN architecture.
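To make the GNN idea concrete, here is a minimal sketch of one round of neighborhood aggregation followed by a graph-level readout. It assumes plain sum aggregation and omits the learned weight matrices and nonlinearities that a real GNN layer would apply; the function names and the three-node path graph are hypothetical.

```python
def message_passing_round(adjacency, features):
    """One aggregation round: each node sums its own and its neighbors' features."""
    updated = {}
    for node, neighbors in adjacency.items():
        agg = list(features[node])
        for nb in neighbors:
            agg = [a + b for a, b in zip(agg, features[nb])]
        updated[node] = agg
    return updated


def graph_readout(features):
    """Sum all node vectors into a single vector for the whole graph."""
    dims = len(next(iter(features.values())))
    return [sum(vec[d] for vec in features.values()) for d in range(dims)]


# Hypothetical path graph 0 - 1 - 2 with one-dimensional node features.
adjacency = {0: [1], 1: [0, 2], 2: [1]}
features = {0: [1.0], 1: [2.0], 2: [3.0]}

hidden = message_passing_round(adjacency, features)
print(hidden, graph_readout(hidden))
```

Because both the aggregation and the readout are sums, relabeling the nodes leaves the graph-level output unchanged. Which graphs such aggregation schemes can and cannot tell apart is one example of a representational question about GNNs.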
We aim to understand the theory and applications of diversity-inducing probabilities (and, more generally, "negative dependence") in machine learning, and to develop fast algorithms based on their mathematical properties.
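A common example of a diversity-inducing probability model is the determinantal point process (DPP), in which the probability of selecting a subset is proportional to the determinant of the corresponding submatrix of a similarity kernel. The text above does not name a specific model, so the choice of a DPP here is an assumption, and the kernel values and function names below are hypothetical.

```python
def det(matrix):
    """Determinant by Laplace expansion; adequate for the tiny matrices used here."""
    n = len(matrix)
    if n == 0:
        return 1.0
    if n == 1:
        return matrix[0][0]
    total = 0.0
    for col in range(n):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * matrix[0][col] * det(minor)
    return total


def dpp_unnormalized_prob(kernel, subset):
    """Unnormalized DPP probability of a subset: det of the kernel restricted to it."""
    sub = [[kernel[i][j] for j in subset] for i in subset]
    return det(sub)


# Hypothetical similarity kernel: items 0 and 1 are near-duplicates, item 2 differs.
kernel = [
    [1.0, 0.9, 0.1],
    [0.9, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]
similar_pair = dpp_unnormalized_prob(kernel, [0, 1])
diverse_pair = dpp_unnormalized_prob(kernel, [0, 2])
print(similar_pair, diverse_pair)
```

The pair of near-duplicate items receives a much smaller weight than the pair of dissimilar items, which is the sense in which such a distribution favors diverse subsets.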
Data often has geometric structure which can enable better inference; this project aims to scale up geometry-aware techniques for use in machine learning settings with lots of data, so that this structure may be utilized in practice.
Many optimization problems in machine learning rely on noisy, estimated parameters. Neglecting this uncertainty can lead to large fluctuations in performance. We are developing algorithms that are robust to such errors, even though the underlying problems are already nonconvex.
Last month, three MIT materials scientists and their colleagues published a paper describing a new artificial-intelligence system that can pore through scientific papers and extract “recipes” for producing particular types of materials.