Automatic Integration and Differentiation of Probabilistic Programs

Speaker: Alexander Lew, MIT CSAIL
Thesis Supervisors: Vikash K. Mansinghka and Joshua B. Tenenbaum

Date: May 8, 2025
Time: 2:15pm ET
Location: Building 46, room 3310

Please contact alexlew@mit.edu for a Zoom link.

Abstract:

By automating the error-prone math behind deep learning, systems such as TensorFlow and PyTorch have supercharged machine learning research, empowering hundreds of thousands of practitioners to rapidly explore the design space of neural network architectures and training algorithms. In this talk, I will show how new programming language techniques, especially generalizations of automatic differentiation, make it possible to extend such systems to support probabilistic models. Our automation is implemented as a suite of composable program transformations for integrating, differentiating, and deriving densities of probabilistic programs. These transformations are rigorously proven sound using new semantic techniques for reasoning about expressive probabilistic programs, and static types are employed to ensure important preconditions for soundness, eliminating large classes of implementation bugs. Providing a further boost, our tools can help users correctly implement fast, low-variance, unbiased estimators of integrals, gradients, and probability densities that are too expensive to compute exactly, enabling orders-of-magnitude speedups in downstream optimization and inference algorithms.
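
For intuition, the sketch below shows one estimator family that such transformations can derive automatically: a score-function ("REINFORCE") gradient estimator, which is unbiased for the derivative of an expectation. It is a hand-written toy, not the talk's actual system; the Normal model, the integrand x^2, and all names in it are illustrative assumptions.

    # Illustrative sketch only: an unbiased score-function gradient
    # estimator for the derivative of an expectation. The model and
    # integrand are toy choices, not the talk's system.
    import random

    def grad_estimate(theta, n=100_000):
        # Unbiasedly estimate d/dtheta of E_{x ~ Normal(theta, 1)}[x^2].
        # Since E[x^2] = theta^2 + 1, the exact gradient is 2 * theta.
        # Each sample contributes f(x) * d/dtheta log p_theta(x),
        # which here is x^2 * (x - theta).
        total = 0.0
        for _ in range(n):
            x = random.gauss(theta, 1.0)    # run the probabilistic program
            total += (x * x) * (x - theta)  # score-function gradient sample
        return total / n

    print(grad_estimate(3.0))  # approximately 6.0, the exact gradient

Estimators like this are easy to get subtly wrong by hand; automating their derivation, with proofs of unbiasedness, is the kind of guarantee the abstract describes.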

To illustrate the value of these techniques, I’ll show how they have helped us experiment with new architectures that could address key challenges with today’s dominant AI models. In particular, I’ll showcase systems we’ve built for (1) auditable reasoning and learning in relational domains, enabling the detection of thousands of errors across millions of Medicare records, and (2) probabilistic inference over large language models, enabling small open models to outperform GPT-4 on several code generation and constrained generation benchmarks.
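
As a rough illustration of item (2), and emphatically not the system from the talk, constrained generation can be framed as probabilistic inference: sample tokens from a model restricted to a hard constraint, and track an importance weight that corrects for the renormalization, so weighted samples target the model's conditional distribution. The stand-in next-token model, the constraint, and all names below are hypothetical placeholders for a real LLM and a real constraint (e.g., syntactic validity of generated code).

    # Toy sketch: constrained generation as importance sampling over a
    # placeholder next-token model. All names here are illustrative.
    import math
    import random

    def model_probs(prefix):
        # Placeholder next-token distribution; a real system would
        # query a language model here.
        return {"a": 0.5, "b": 0.3, "<eos>": 0.2}

    def constraint_ok(token, prefix):
        # Example hard constraint: never emit "b" immediately after "a".
        return not (prefix and prefix[-1] == "a" and token == "b")

    def sample_constrained(max_len=10):
        prefix, log_weight = [], 0.0
        for _ in range(max_len):
            probs = model_probs(prefix)
            allowed = {t: p for t, p in probs.items()
                       if constraint_ok(t, prefix)}
            z = sum(allowed.values())
            log_weight += math.log(z)  # corrects for renormalizing to `allowed`
            token = random.choices(list(allowed),
                                   weights=list(allowed.values()))[0]
            prefix.append(token)
            if token == "<eos>":
                break
        return "".join(prefix), log_weight

    print(sample_constrained())  # a constrained string and its log-weight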