Learning complex functional constraints in proteins and non-coding DNA

Speaker

Yun Song

Berkeley

Host

Bonnie Berger

CSAIL MIT

Abstract: Predicting the impact of genetic variants is a significant challenge in computational biology, with crucial applications in disease diagnosis, gene regulation modeling, and protein engineering. In this talk, I will describe my lab's recent work on improving variant effect prediction for both coding and non-coding regions by leveraging advances in unsupervised learning, particularly self-supervised learning from natural language processing. For coding variants, I will introduce a robust learning framework to transfer properties between unrelated proteins and discuss how this approach fares in comparison with the recently published PrimateAI-3D and AlphaMissense methods. Regarding non-coding variants, I will present our work on DNA language models, highlighting their efficacy in genome-wide variant effect prediction.

Bio: Yun Song is a Professor of EECS and Statistics at the University of California, Berkeley, where he has been since 2007. He was originally trained in mathematics and theoretical physics, with degrees from MIT and Stanford University. He transitioned into population genetics during his postdoc at the University of Oxford, and over the past 20 years he has been carrying out interdisciplinary research on diverse computational biology problems. His awards and honors include an NIH Pathway to Independence Award (K99/R00), an NSF CAREER Award, an Alfred P. Sloan Research Fellowship, a Packard Fellowship for Science and Engineering, and a Chan Zuckerberg Biohub Investigator Award.

Zoom link: https://mit.zoom.us/j/93513735220

Location: 32 G-575

Refreshments will be available

Date: December 13, 2023, 11:30 AM to 1:00 PM (America/New_York)

Organizer & Contact

Shuvom Sadhuka

ssadhuka@mit.edu

Part of

Bioinformatics Seminar Series 2023

Related Events

September 11

Multimodal Protein Foundation Models

October 02

Bioinformatics Seminar - Prediction potential and pitfalls in pervasive population personal genomics: Interpreting newborn genomes with Notes on privacy timebombs in functional genomics data