CSAIL Forum with Prof. Yoon Kim: Efficient and Expressive Architectures for Language Modeling


Speaker: Yoon Kim, Assistant Professor, CSAIL 


Tuesday, April 22, 2025, 12:00-1:00 PM EDT
Live stream via Zoom; registration required.

Abstract:

Transformers are the dominant architecture for language modeling (and generative AI more broadly). The attention mechanism in Transformers is considered core to the architecture and enables accurate sequence modeling at scale. However, the complexity of attention is quadratic in input length, which makes it difficult to apply Transformers to long sequences. Moreover, Transformers have theoretical limitations on the class of problems they can solve, which prevents them from modeling certain kinds of phenomena such as state tracking. This talk will describe some recent work on efficient alternatives to Transformers that can overcome these limitations.
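To make the quadratic cost concrete, here is a minimal NumPy sketch of standard scaled dot-product attention (an illustration, not code from the talk): the n x n score matrix is what makes attention's time and memory grow quadratically with sequence length n, and it is this matrix that the efficient alternatives discussed in the talk avoid materializing.

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Q, K, V: arrays of shape (n, d) for a sequence of length n.
    The (n, n) score matrix below is the source of attention's
    quadratic cost in sequence length.
    """
    n, d = Q.shape
    scores = Q @ K.T / np.sqrt(d)  # shape (n, n): quadratic in n
    # Numerically stable row-wise softmax over the scores.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V  # shape (n, d)

# Doubling n quadruples the size of the score matrix.
rng = np.random.default_rng(0)
n, d = 1024, 64
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
print(attention(Q, K, V).shape)  # (1024, 64)
```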

Bio: 

Yoon Kim is an assistant professor at MIT EECS and a principal investigator at CSAIL, where he works on natural language processing and machine learning. He obtained his Ph.D. in computer science from Harvard University.