Peter Szolovits






Peter Szolovits is Professor of Computer Science and Engineering and head of the Clinical Decision-Making Group within CSAIL. He is also an associate member of the MIT Institute for Medical Engineering and Science (IMES) and on the faculty of the Harvard/MIT Health Sciences and Technology program.

His research centers on the application of AI methods to problems of medical decision making, predictive modeling, decision support, and design of information systems for health care institutions and patients. He has worked on problems of diagnosis, therapy planning, execution and monitoring for various medical conditions, computational aspects of genetic counseling, controlled sharing of health information, and privacy and confidentiality issues in medical record systems.

Peter Szolovits' interests in AI include machine learning, natural language processing, knowledge representation, qualitative reasoning, and probabilistic inference. His interests in medical computing include Web-based heterogeneous medical record systems, life-long personal health information systems, and design of cryptographic schemes for health identifiers. He teaches classes in biomedical computing and in computer systems engineering, and has taught artificial intelligence, programming languages, medical decision making, knowledge-based systems, and probabilistic inference.

Prof. Szolovits has served as program chairman and on the program committees of national conferences and on the editorial boards of several journals, and has been a founder of and consultant for several companies that apply AI to problems of commercial interest. He was elected to the National Academy of Medicine and is a Fellow of the American Association for Artificial Intelligence, the American College of Medical Informatics, the American Institute for Medical and Biological Engineering, and the International Academy of Health Sciences Informatics.




Quantifying Racial Disparities in End-of-Life Care

When discussing racial disparities in medical treatments, critics often cite social factors as confounders that explain away any differences. Comparing the health of whites to that of non-whites, we do see that environmental and social factors conspire to yield higher rates of disease and shorter life spans in non-white populations. But does that really show that medical treatment itself is free from bias? We examine end-of-life care in the ICU, stratified by ethnicity and controlled for acuity using severity assessment scores. Our analysis agrees with previous studies that non-whites tend to receive more aggressive (high-risk, high-reward) treatments, such as mechanical ventilation, than whites, despite receiving comparable or moderately fewer noninvasive treatments. Going further, we show that we are able to infer a patient's race from treatment patterns and clinical notes. Finally, we show evidence suggesting that non-whites have a much greater distrust of the medical community than whites do. We find that race, even in the great equalizer of end-of-life care, continues to influence the treatments administered to a patient.


CliNER: Clinical Concept Extraction

Clinical concept extraction (CCE) of named entities (such as problems, tests, and treatments) aids in forming an understanding of notes and provides a foundation for many downstream clinical decision-making tasks. Historically, this task has been posed as a standard named entity recognition (NER) sequence tagging problem and solved with feature-based methods using hand-engineered domain knowledge. Recent advances, however, have demonstrated the efficacy of LSTM-based models for NER tasks, including CCE. This work presents CliNER 2.0, a simple-to-install, open-source tool for extracting concepts from clinical text. CliNER 2.0 uses a word- and character-level LSTM model and achieves state-of-the-art performance. For ease of use, the tool also includes pre-trained models available for public use.
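To make the "sequence tagging" framing concrete, here is a minimal sketch of how per-token tags can be decoded into concept spans. It assumes a BIO tagging scheme with the labels `problem`, `test`, and `treatment`; the function name `bio_to_spans` and the example sentence are illustrative only and are not taken from CliNER's code. In a real system the tags would come from a trained model rather than being written by hand.

```python
from typing import List, Tuple


def bio_to_spans(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Group BIO-tagged tokens into (concept text, label) pairs.

    "B-x" starts a concept of type x, "I-x" continues it, "O" is outside.
    An "I-x" tag that does not continue a matching concept is ignored.
    """
    spans: List[Tuple[str, str]] = []
    current_tokens: List[str] = []
    current_label = None

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # Close any open concept before starting a new one.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = [token]
            current_label = tag[2:]
        elif tag.startswith("I-") and current_label == tag[2:]:
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag ends the open concept.
            if current_tokens:
                spans.append((" ".join(current_tokens), current_label))
            current_tokens = []
            current_label = None

    if current_tokens:
        spans.append((" ".join(current_tokens), current_label))
    return spans


# Hand-written tags standing in for model output (hypothetical example).
tokens = ["Patient", "denies", "chest", "pain", ";", "started", "on", "aspirin", "."]
tags = ["O", "O", "B-problem", "I-problem", "O", "O", "O", "B-treatment", "O"]
print(bio_to_spans(tokens, tags))
```

Under this framing, the model's job is to predict one tag per token; the decoding step above is the same whether the tags come from a feature-based tagger or an LSTM.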


Information Retrieval for Cancer Treatments in Clinical Literature and Trial Eligibility

We take a "precision medicine" approach to finding relevant cancer treatments in the clinical literature and in eligible trials. For a given patient with associated demographics (age, gender) and disease (cancer type, genetic variants), we query a database of all PubMed articles and ClinicalTrials.gov trials using NLP techniques to find the most useful and relevant treatments for the patient. Our ensemble-based system performed very well in the TREC 2016 Precision Medicine challenge.


Synthetically-Identified Clinical Notes

Clinical notes often describe the most important aspects of a patient's physiology and are therefore critical to medical research. However, these notes are typically inaccessible to researchers without prior removal of sensitive protected health information (PHI), a natural language processing (NLP) task referred to as de-identification. To build tools that perform de-identification, one typically needs the very same data that is private, which creates a chicken-and-egg problem. In this work, we generate "fake" clinical notes in which the de-identified information is replaced with real-seeming values (e.g. "Tim Lywood" instead of "George Beveridge") that still respect reasonable distributional semantics. We evaluate models trained on this synthetic data and show that they perform just as well as models trained on the sensitive PHI-bearing notes.
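The replacement step can be illustrated with a small sketch. It assumes that de-identified notes mark removed PHI with bracketed placeholders such as `[**NAME**]`; that placeholder format, the `SURROGATES` table, and the function `fill_placeholders` are hypothetical choices made for this example, not the project's actual pipeline. The sketch draws surrogates uniformly at random, so it does not attempt to respect the distributional semantics mentioned above.

```python
import random
import re

# Hypothetical surrogate values for each PHI category (illustration only).
SURROGATES = {
    "NAME": ["Tim Lywood", "Ana Reyes", "John Carver"],
    "HOSPITAL": ["Lakeview General", "St. Alder Medical Center"],
    "DATE": ["2014-03-02", "2015-11-19"],
}

# Assumed placeholder format: [**CATEGORY**]
PLACEHOLDER = re.compile(r"\[\*\*(\w+)\*\*\]")


def fill_placeholders(note: str, rng: random.Random) -> str:
    """Replace each known placeholder with a randomly chosen surrogate."""

    def substitute(match: re.Match) -> str:
        choices = SURROGATES.get(match.group(1))
        if choices is None:
            # Unknown category: leave the placeholder untouched.
            return match.group(0)
        return rng.choice(choices)

    return PLACEHOLDER.sub(substitute, note)


rng = random.Random(0)  # seeded so repeated runs give the same note
note = "[**NAME**] was admitted to [**HOSPITAL**] on [**DATE**]."
print(fill_placeholders(note, rng))
```

A tagger trained on notes produced this way sees realistic-looking names, hospitals, and dates in context without ever being exposed to real PHI.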



Community of Research

Applied Machine Learning Community of Research

This CoR brings together researchers at CSAIL working across a broad swath of application domains. Within these lie novel and challenging machine learning problems serving science, social science and computer science.

Cognitive AI Community of Research

This CoR aims to develop AI technology that synthesizes symbolic reasoning, probabilistic reasoning for dealing with uncertainty in the world, and statistical methods for extracting and exploiting regularities in the world, into an integrated picture of intelligence that is informed by computational insights and by cognitive science.