Improving and Proving: Reinforcement Learning for Proof Shortening and Hierarchical Theorem Proving in Lean

Speaker

MIT CSAIL

Host

MIT CSAIL

RSVP here: https://forms.gle/thXsAVZkNUy6Bgwt6


Reinforcement learning (RL) has recently emerged as a powerful driver of progress in formal theorem proving, enabling near–IMO-level performance. However, it has also encouraged models to produce long, redundant, and opaque proofs. In this talk, I will present two complementary ways RL can enhance automated reasoning in Lean. First, I will introduce ProofOptimizer, which applies RL to fine-tune a model that simplifies these overcomplicated proofs. Combining symbolic linting, a 7B model trained with expert iteration, and iterative refinement, ProofOptimizer reduces proof length by up to 87% on MiniF2F and 57% on PutnamBench. Second, I will show how coupling online RL with hierarchical (lemma-based) inference yields more scalable theorem proving: RL successfully trains proof-decomposition policies, while hierarchical inference improves exploration and data efficiency. Together, these results illustrate how RL can be used both to improve existing proofs and to prove new theorems more effectively.
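To give a flavor of the proof-shortening problem, here is a hypothetical toy example (not drawn from the talk's benchmarks) of the kind of redundant Lean 4 proof an RL-trained prover might emit, alongside the compressed form an optimizer would aim for:

```lean
-- A deliberately roundabout proof, typical of RL-trained provers:
-- it introduces an unnecessary intermediate hypothesis before closing the goal.
example (a b : Nat) : a + b = b + a := by
  have h : a + b = b + a := by
    exact Nat.add_comm a b
  exact h

-- The shortened equivalent a proof optimizer would target:
-- the same theorem closed directly with the library lemma.
example (a b : Nat) : a + b = b + a := Nat.add_comm a b
```

Real generated proofs are, of course, far longer, but the optimization objective is the same: produce a semantically equivalent proof that still type-checks while eliminating redundant steps.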