Potential COVID-19 vaccines get a boost from machine learning


After what has started to feel like a boundless eternity of wearing masks, bathing in hand sanitizer, and dodging people in the grocery store, many of us have been left thinking: what would a COVID-19 vaccine look like? 

Different approaches to the challenge have looked at targeting the so-called “spike proteins” that cover the virus and help it invade human cells. 

Whole virus, DNA, and RNA vaccine platforms have been explored using a range of techniques, all in the hopes of creating immunity and changing the unpredictable trajectory of the novel virus. 

Recently, a team of researchers from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) took a new approach to getting us closer to a solution: a combinatorial machine learning system that selects peptides (short strings of amino acids) that are predicted to provide high population coverage for a vaccine. 

The design system, called “OptiVax,” introduces methods for designing new peptide vaccines, evaluating existing vaccines, and augmenting existing vaccine designs. In this system, machine learning models score peptides on their predicted ability to be displayed and elicit an immune response, and peptides are then selected to maximize the share of the population that could benefit from the vaccine.

“We evaluated a common vaccine design based on the spike protein for COVID-19 that is currently in multiple clinical trials,” say Ge Liu and Brandon Carter, CSAIL PhD students and lead authors on a new paper about OptiVax. “Based on our analysis, we developed an augmentation to improve its population coverage by adding peptides. If this works in animal models, the design could move to human clinical trials.”


In building out their system, the team first adjusted their predictive models and used multiple models to design a vaccine. 

Taking into consideration the vast differences in our individual DNA, the researchers paid close attention to the genetic makeup of different populations, to maximize the likelihood that people with uncommon genes would still be covered by the vaccine. 

Armed with this, they created OptiVax. 

OptiVax works by identifying all possible peptide fragments from a set of viral or tumor proteins that would be good candidates for a vaccine. 
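As a rough illustration of what enumerating peptide fragments can look like, the sketch below slides a window over a protein sequence and collects every contiguous fragment in a length range. The function name, the length bounds, and the toy sequence are assumptions made for this example; the article does not describe how OptiVax implements this step.

```python
def candidate_peptides(protein, min_len=8, max_len=10):
    """Return every distinct contiguous fragment of `protein`
    whose length is between min_len and max_len (inclusive).

    The length bounds are illustrative assumptions, not values
    taken from the OptiVax paper.
    """
    seen = set()
    fragments = []
    for length in range(min_len, max_len + 1):
        # Slide a window of the current length across the sequence.
        for start in range(len(protein) - length + 1):
            fragment = protein[start:start + length]
            if fragment not in seen:
                seen.add(fragment)
                fragments.append(fragment)
    return fragments


# Arbitrary 14-letter amino-acid string used only for demonstration.
toy_protein = "MFVFLVLLPLVSSQ"
peptides = candidate_peptides(toy_protein, min_len=8, max_len=9)
print(len(peptides), peptides[:3])
```

Each fragment produced this way would then still need to be scored before it could be considered a good candidate.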

Then, peptides are scored for selection on multiple criteria, including their observed mutation rate across nearly 5,000 geographically sampled genomes. Because these peptide fragments stem from the virus, administering them in a vaccine can lead to immunity.

OptiVax then designs a vaccine from these candidates to maximize population coverage in different geographical regions, as well as the number of peptides displayed per individual, to improve the chances that a person will become immune.
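To make the idea of selecting peptides for population coverage concrete, here is a minimal sketch that uses a simple greedy heuristic for weighted maximum coverage. This is a stand-in technique chosen for illustration, not the optimization method the researchers used. The genotype labels, their frequencies, and the table of which peptides are displayed by which genotype are all hypothetical.

```python
def greedy_select(peptide_hits, genotype_freq, budget):
    """Pick up to `budget` peptides, each time adding the one that
    covers the largest not-yet-covered share of the population.

    peptide_hits:  peptide -> set of genotypes predicted to display it
    genotype_freq: genotype -> population frequency (sums to 1.0)
    """
    chosen = []
    covered = set()
    for _ in range(budget):
        best, best_gain = None, 0.0
        for peptide, hits in peptide_hits.items():
            if peptide in chosen:
                continue
            # Frequency mass this peptide would newly cover.
            gain = sum(genotype_freq[g] for g in hits - covered)
            if gain > best_gain:
                best, best_gain = peptide, gain
        if best is None:
            break  # no remaining peptide adds coverage
        chosen.append(best)
        covered |= peptide_hits[best]
    coverage = sum(genotype_freq[g] for g in covered)
    return chosen, coverage


# Hypothetical toy data, for illustration only.
genotype_freq = {"G1": 0.5, "G2": 0.25, "G3": 0.125, "G4": 0.125}
peptide_hits = {
    "PEP_A": {"G1", "G2"},
    "PEP_B": {"G2", "G3"},
    "PEP_C": {"G4"},
    "PEP_D": {"G1"},
}
chosen, coverage = greedy_select(peptide_hits, genotype_freq, budget=2)
print(chosen, coverage)
```

A greedy pass like this only counts whether a genotype is covered at all; it does not try to raise the number of peptides displayed per individual, which the article says the real system also takes into account.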

The team then used "EvalVax," a complementary system they designed that predicts coverage for vaccines, to evaluate 29 different vaccine designs by others. They found that many of them were not predicted to provide high population coverage.
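One simple way to think about predicted coverage is as the share of the population expected to display at least some minimum number of a vaccine's peptides. The sketch below computes that quantity on toy data. It is an assumption-laden illustration, not EvalVax's actual model: the function name, the genotype frequencies, and the peptide-to-genotype table are all hypothetical.

```python
def coverage_at_least(vaccine, peptide_hits, genotype_freq, min_hits=1):
    """Fraction of the population whose genotype is predicted to
    display at least `min_hits` peptides from `vaccine`.

    peptide_hits:  peptide -> set of genotypes predicted to display it
    genotype_freq: genotype -> population frequency (sums to 1.0)
    """
    total = 0.0
    for genotype, freq in genotype_freq.items():
        # Count how many vaccine peptides this genotype displays.
        displayed = sum(
            1 for p in vaccine if genotype in peptide_hits.get(p, set())
        )
        if displayed >= min_hits:
            total += freq
    return total


# Hypothetical toy data, for illustration only.
genotype_freq = {"G1": 0.5, "G2": 0.25, "G3": 0.125, "G4": 0.125}
peptide_hits = {
    "PEP_A": {"G1", "G2"},
    "PEP_B": {"G2", "G3"},
    "PEP_C": {"G4"},
}
design = ["PEP_A", "PEP_B"]
print(coverage_at_least(design, peptide_hits, genotype_freq, min_hits=1))
print(coverage_at_least(design, peptide_hits, genotype_freq, min_hits=2))
```

Raising `min_hits` shows how a design that appears to cover most people with one peptide can cover far fewer when several displayed peptides are required.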

“One of the challenges here was assembling good data on how people differ in their genetic makeup, in key genes that control the response to a vaccine or viral infection,” says MIT professor David Gifford. “And then, we had to solve a difficult optimization problem to design a vaccine with good population coverage.” 

Future work 

Once animal testing of their vaccine design is done, the team says they can then evaluate whether a clinical trial is warranted. This, they note, will also depend upon the efficacy of the first set of vaccines already being clinically tested.

One of the wild cards of COVID-19 is the inability to predict how different individuals will respond, with outcomes ranging from minor symptoms to fatal cases.

With that in mind, the researchers are working with a team at the National Institutes of Health (NIH) to see if their methods can be used for risk prediction using data from COVID-19 patients.

The team notes that their framework could be used to design vaccines for a wide range of infectious diseases, and hopes to apply it to other viral infections in the future.

Liu, Carter, and Gifford wrote the paper alongside CSAIL postdoc Siddhartha Jain, Trenton Bricken of Duke University, and Mathias Viard and Mary Carrington of the Basic Science Program at the Frederick National Laboratory for Cancer Research and the Ragon Institute of MGH, MIT, and Harvard.

This work was supported in part by Schmidt Futures and an NIH grant.