H-Nets: Dynamic Chunking for End-to-End Hierarchical Sequence Modeling

Speaker

Cartesia AI (https://cartesia.ai/)

Host

MIT CSAIL, AI@MIT Reading Group

RSVP here: https://forms.gle/Es9S5VtgiGA8fvkd9 

Dynamic chunking (arXiv:2507.07955) is a recently introduced method for training end-to-end hierarchical sequence models (H-Nets). H-Nets model sequences while explicitly chunking them into higher-order concepts, allowing them to train on raw UTF-8 bytes without tokenization and to learn sparser language representations than subword tokens provide. On language modeling, H-Nets outperform a standard tokenized Transformer baseline and match the performance of a much larger Transformer. They also perform strongly in other language modeling settings, such as Chinese and code, while learning data-dependent, context-aware chunking schemes. In this talk, we discuss the methodology and ideas behind H-Net, and share some motivation and context for the work.
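To give a flavor of the idea, here is a minimal, illustrative sketch of similarity-based chunking over a sequence of byte representations. It is not the H-Net method itself (whose routing module is learned end-to-end as part of the network); it only shows the general principle of placing chunk boundaries where adjacent representations disagree. The function name, threshold, and toy vectors are all hypothetical.

```python
# Toy sketch: place a chunk boundary wherever consecutive byte
# representations point in dissimilar directions. Illustrative only --
# H-Net learns its boundary predictor end-to-end; this is a hand-set
# stand-in using cosine similarity.
import numpy as np

def chunk_boundaries(reps: np.ndarray, threshold: float = 0.5) -> list[int]:
    """Return indices where a new chunk starts.

    reps: (T, d) array of per-position representations.
    A boundary score in [0, 1] is derived from the cosine similarity of
    each pair of neighbors; position 0 always starts a chunk.
    """
    normed = reps / np.linalg.norm(reps, axis=1, keepdims=True)
    sims = (normed[:-1] * normed[1:]).sum(axis=1)      # cosine of neighbors
    boundary_scores = (1.0 - sims) / 2.0               # high when dissimilar
    starts = [0] + [i + 1 for i, s in enumerate(boundary_scores)
                    if s >= threshold]
    return starts

# Hypothetical 2-d representations: the first two positions are nearly
# parallel, then the direction flips, signaling a new chunk.
reps = np.array([
    [1.0, 0.0], [1.0, 0.1],     # chunk 1
    [-1.0, 0.0], [-1.0, 0.1],   # chunk 2 (direction flip -> boundary)
])
print(chunk_boundaries(reps))   # -> [0, 2]
```

In the actual model, boundary decisions like these control how many byte positions are compressed into each higher-level chunk, which is what lets the hierarchy operate on sparser, data-dependent units instead of fixed subword tokens.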