CSAIL Forum with Prof Yoon Kim: Efficient and Expressive Architectures for Language Modeling
Speaker: Yoon Kim, Assistant Professor, CSAIL
Tuesday, April 22, 2025, 12:00-1:00 PM EDT
Live stream via Zoom; registration required
Abstract:
Transformers are the dominant architecture for language modeling (and generative AI more broadly). The attention mechanism is core to the architecture and enables accurate sequence modeling at scale. However, the complexity of attention is quadratic in input length, which makes it difficult to apply Transformers to long sequences. Moreover, Transformers have theoretical limitations on the class of problems they can solve, which prevents them from modeling certain kinds of phenomena such as state tracking. This talk will describe recent work on efficient alternatives to Transformers that can overcome these limitations.
Bio:
Yoon Kim is an assistant professor at MIT EECS and a principal investigator at CSAIL, where he works on natural language processing and machine learning. He obtained his Ph.D. in computer science from Harvard University.