Sara Beery, Marzyeh Ghassemi, and Yoon Kim, EECS faculty and CSAIL principal investigators, were awarded AI2050 Early Career Fellowships earlier this week for their pursuit of “bold and ambitious work on hard problems in AI.” They received this honor from Schmidt Futures, Eric and Wendy Schmidt’s philanthropic initiative that aims to accelerate scientific innovation.
Each year, the AI2050 program awards fellowships to both up-and-coming and senior researchers for their efforts to help build generally useful AI systems, address human challenges like climate change and global disease, and use intelligent technology responsibly. Previous recipients include Daniela Rus, director of CSAIL and MIT EECS professor, and Joshua Tenenbaum, a professor in the Department of Brain and Cognitive Sciences and CSAIL principal investigator, both of whom were named Senior Fellows, as well as Dylan Hadfield-Menell, an MIT EECS professor and CSAIL principal investigator, who was named an Early Career Fellow.
Sara Beery: Using AI to scale up environmental monitoring
Beery has focused on building computer vision techniques that help us understand how species and environments are changing globally. To accomplish this, her group monitors biodiversity across data modalities like images, remote sensing data, acoustics, and sonar. Her work addresses core problems like spatiotemporal correlations, imperfect data quality, fine-grained categories, and long-tailed distributions.
Her recent work includes “Tree-D Fusion,” a digital twin system that can identify trees in cities, predict how they’ll grow, and measure how they’ll impact their surroundings. Her team merged AI and tree-growth models with Google's Auto Arborist data to create accurate 3D models of urban trees, resulting in the first-ever large-scale database of 600,000 environmentally aware, simulation-ready tree models across North America.
Her lab also recently published the “INQUIRE” dataset, which tested the image-retrieval skills of vision-language models on five million wildlife pictures and 250 search prompts from ecologists and other biodiversity experts. In these evaluations, the researchers found that advanced image-understanding models performed reasonably well on straightforward queries about visual content, but struggled with queries requiring expert knowledge. Beery’s AI2050 Early Career Fellowship will fund work to address this blind spot, helping her design AI models that can unlock valuable secondary data, like biological conditions and behaviors, within biodiversity images.
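To make the retrieval task concrete, such systems typically embed every image and the text query into a shared vector space and rank images by similarity. The sketch below is a minimal stand-in under loose assumptions: random unit vectors replace a real vision-language model's learned encoders, and the function names and dimensions are illustrative, not part of the INQUIRE release.

```python
# Hypothetical sketch of embedding-based image retrieval, the task INQUIRE
# evaluates. A real system would use a vision-language model's learned
# encoders (e.g., CLIP-style); random unit vectors stand in for them here.
import numpy as np

rng = np.random.default_rng(42)
DIM = 128  # illustrative embedding width

def embed_images(n_images):
    """Stand-in for an image encoder: one unit vector per image."""
    vecs = rng.normal(size=(n_images, DIM))
    return vecs / np.linalg.norm(vecs, axis=1, keepdims=True)

def embed_text(query):
    """Stand-in for a text encoder mapping a query into the same space."""
    vec = rng.normal(size=DIM)
    return vec / np.linalg.norm(vec)

image_embeddings = embed_images(1000)                 # the image collection
query_embedding = embed_text("a bird feeding its chicks")

# With unit vectors, cosine similarity reduces to a dot product.
scores = image_embeddings @ query_embedding
top_k = np.argsort(scores)[::-1][:5]                  # best-matching images
print("Top matches:", top_k)
```

Expert queries are hard for this setup precisely because the relevant signal, such as a behavior or biological condition, may not be captured by general-purpose embeddings at all.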
Beery previously earned a BS in electrical engineering and mathematics from Seattle University and a PhD in computing and mathematical sciences from Caltech, where she received the Amori Prize for her dissertation.
Marzyeh Ghassemi: Improving safety and equity in health with machine learning
Ghassemi, a primary faculty member at the Laboratory for Information & Decision Systems (LIDS), leads the “Healthy ML” research group, which studies how machine learning can be made more robust and applied to improve safety and equity in health. More specifically, the group develops models that can efficiently and accurately predict events from healthcare data, and investigates best practices for multi-source integration and for learning domain-appropriate representations.
Recent projects of hers include a new technique that could help people determine when to trust an AI model’s predictions; a study showing that the AI models most accurate at predicting race and gender from X-ray images also show the biggest “fairness gaps”; and a sociotechnical approach to understanding biased medical data that views such data as “akin to archaeological artifacts.”
Ghassemi earned her PhD at MIT, her MS in biomedical engineering from Oxford University, and her BS in computer science and electrical engineering from New Mexico State University.
Yoon Kim: Enhancing the capabilities and efficiency of large language models
Kim focuses on machine learning and natural language processing. His work homes in on how to efficiently train and deploy large-scale AI models, what language models are capable of doing, and the mechanisms that can control and augment neural networks.
Earlier this year, Kim helped develop “Co-LLM,” an algorithm that pairs a general-purpose base large language model (LLM) with a more specialized model and guides the two to work together. As it generates an answer, the system reviews each word (or token) in its response to see where it can call upon a more accurate answer from the expert model, leading to more factual responses.
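A minimal sketch of that per-token deferral idea appears below, under stated assumptions: the two models and the gate are random stand-ins (the published Co-LLM learns its gate from data), and the function names are hypothetical. Only the control flow, a per-token decision routing generation between a base model and an expert, reflects the idea described above.

```python
# Minimal, hypothetical sketch of per-token deferral between two models,
# in the spirit of Co-LLM. The models and gate below are random stand-ins
# (Co-LLM learns its gate); the function names are illustrative only.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = 50  # toy vocabulary size

def base_model_logits(tokens):
    """Stand-in for the general-purpose base LLM's next-token logits."""
    return rng.normal(size=VOCAB)

def expert_model_logits(tokens):
    """Stand-in for the specialized expert model's next-token logits."""
    return rng.normal(size=VOCAB)

def gate_probability(tokens):
    """Stand-in for a learned gate scoring whether to defer this token."""
    return rng.uniform()

def generate(context, max_tokens=20, defer_threshold=0.5):
    tokens = list(context)
    for _ in range(max_tokens):
        # Per token, decide which model supplies the next prediction.
        if gate_probability(tokens) > defer_threshold:
            logits = expert_model_logits(tokens)  # call on the expert
        else:
            logits = base_model_logits(tokens)    # keep the base model
        tokens.append(int(np.argmax(logits)))     # greedy decoding
    return tokens

print(generate([1, 2, 3]))
```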
He also recently examined how LLMs fare on variations of familiar tasks, such as arithmetic, chess, and logical reasoning. His team found that the models excel in familiar scenarios but struggle with unfamiliar variants, highlighting their reliance on memorization rather than reasoning.
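One way to picture an “unfamiliar variant”: the same skill, addition, can be posed in a familiar base and an uncommon one. The snippet below generates such paired prompts; the wording is invented here for illustration and is not taken from the study.

```python
# Hedged illustration of a "counterfactual" task variant: the same skill
# (addition) posed in a familiar base and an unfamiliar one. The prompt
# wording is invented here, not taken from the study.
def to_base(n, base):
    """Render a non-negative integer in the given base."""
    digits = []
    while True:
        n, r = divmod(n, base)
        digits.append(str(r))
        if n == 0:
            break
    return "".join(reversed(digits))

def addition_prompt(a, b, base):
    return (f"In base {base}: {to_base(a, base)} + {to_base(b, base)} = "
            f"{to_base(a + b, base)}")

# A model that has memorized base-10 arithmetic may ace the first line
# and fail the second, even though the underlying rule is identical.
print(addition_prompt(27, 45, base=10))  # In base 10: 27 + 45 = 72
print(addition_prompt(27, 45, base=9))   # In base 9: 30 + 50 = 80
```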
Kim’s AI2050 Fellowship will fund work toward expanding the capabilities of LLMs while also reducing their environmental impact and making such tools more broadly accessible. To help accomplish this goal, he intends to continue developing efficient architectures and algorithms, like his recent work on making transformers more efficient. Such techniques could reduce the amount of computing resources consumed by the transformer, the neural-network architecture at the core of most LLMs that enables them to process data.
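For context on where that compute goes, the sketch below implements textbook scaled dot-product attention, the transformer operation whose cost grows quadratically with sequence length. This is the standard formulation for reference only, not a rendering of Kim's efficiency techniques, which the article does not detail.

```python
# Textbook scaled dot-product attention. Compute and memory grow
# quadratically with sequence length n because of the n-by-n score matrix;
# efficiency research targets exactly this cost.
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (n, d) arrays of query, key, and value vectors."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)                       # (n, n) cost term
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)      # softmax over keys
    return weights @ V                                  # (n, d) outputs

rng = np.random.default_rng(1)
n, d = 8, 16  # toy sequence length and head width
out = attention(rng.normal(size=(n, d)),
                rng.normal(size=(n, d)),
                rng.normal(size=(n, d)))
print(out.shape)  # (8, 16)
```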
He earned his PhD in computer science at Harvard University, his MS in data science from New York University, his MA in statistics from Columbia University, and his BA in both math and economics from Cornell University.