Models That Prove Their Own Correctness
Host
Rahul Ilango
MIT
Abstract:
This talk introduces Self-Proving models, a new class of models that formally prove the correctness of their outputs via an Interactive Proof system. After reviewing some related literature, I will formally define Self-Proving models and their per-input (worst-case) guarantees. I will then present algorithms for learning these models and explain how the complexity of the proof system affects the complexity of the learning algorithms. Finally, I will show experiments where Self-Proving models are trained to compute the Greatest Common Divisor of two integers, and to prove the correctness of their results to a simple verifier.
No prior knowledge of autoregressive models or Interactive Proofs will be assumed of the listener. This is a joint work with Noga Amit, Shafi Goldwasser, and Guy Rothblum.
This talk introduces Self-Proving models, a new class of models that formally prove the correctness of their outputs via an Interactive Proof system. After reviewing some related literature, I will formally define Self-Proving models and their per-input (worst-case) guarantees. I will then present algorithms for learning these models and explain how the complexity of the proof system affects the complexity of the learning algorithms. Finally, I will show experiments where Self-Proving models are trained to compute the Greatest Common Divisor of two integers, and to prove the correctness of their results to a simple verifier.
No prior knowledge of autoregressive models or Interactive Proofs will be assumed of the listener. This is a joint work with Noga Amit, Shafi Goldwasser, and Guy Rothblum.