The Structure of Optimal Private Tests for Simple Hypotheses

Speaker

Audra McMillan

Host

Govind Ramnarayan, Quanquan Liu, Sitan Chen, Nikhil Vyas

Abstract: Hypothesis testing plays a central role in statistical inference, and is used in many settings where privacy concerns are paramount. In this talk we’ll address a basic question about privately testing simple hypotheses: given two distributions P and Q, and a privacy level \epsilon, how many i.i.d. samples are needed to distinguish P from Q subject to \epsilon-differential privacy, and what sort of tests have optimal sample complexity? Specifically, we'll characterize this sample complexity up to constant factors in terms of the structure of P and Q and the privacy level \epsilon, and show that this sample complexity is achieved by a certain randomized and clamped variant of the log-likelihood ratio test. This result is an analogue of the classical Neyman–Pearson lemma in the setting of private hypothesis testing. The characterization applies more generally to hypothesis tests satisfying essentially any notion of algorithmic stability, which is known to imply strong generalization bounds in adaptive data analysis, and thus our results have applications even when privacy is not a primary concern.

Joint work with Clement Canonne, Gautam Kamath, Adam Smith and Jonathan Ullman.
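As a rough illustration of what a "randomized and clamped variant of the log-likelihood ratio test" could look like, the following is a minimal sketch and not the speakers' exact construction. It assumes discrete distributions given as dictionaries with full shared support, hypothetical clamp bounds `lo` and `hi`, Laplace noise as the source of randomization, and a hypothetical decision `threshold`; none of these choices are specified in the abstract. Clamping bounds each sample's contribution to the interval [lo, hi], so changing one sample moves the statistic by at most hi - lo, and Laplace noise of scale (hi - lo)/epsilon then gives \epsilon-differential privacy by the standard Laplace mechanism.

```python
import math
import random


def clamped_llr(samples, p, q, lo, hi):
    """Sum of per-sample log-likelihood ratios log(p[x]/q[x]), each clamped to [lo, hi].

    Assumes p[x] > 0 and q[x] > 0 for every sample x (an assumption of this sketch).
    """
    total = 0.0
    for x in samples:
        llr = math.log(p[x] / q[x])
        total += min(max(llr, lo), hi)
    return total


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials with mean `scale`."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_clamped_llr_test(samples, p, q, epsilon, lo, hi, threshold=0.0, rng=None):
    """Return "P" if the noisy clamped statistic exceeds `threshold`, else "Q".

    One sample changes the clamped sum by at most (hi - lo), so Laplace noise
    with scale (hi - lo) / epsilon makes the released statistic epsilon-DP.
    """
    rng = rng or random.Random()
    stat = clamped_llr(samples, p, q, lo, hi)
    noisy_stat = stat + laplace_noise((hi - lo) / epsilon, rng)
    return "P" if noisy_stat > threshold else "Q"


if __name__ == "__main__":
    # Hypothetical example distributions over {0, 1}.
    p = {0: 0.9, 1: 0.1}
    q = {0: 0.1, 1: 0.9}
    rng = random.Random(0)
    samples = [0 if rng.random() < p[0] else 1 for _ in range(200)]
    print(noisy_clamped_llr_test(samples, p, q, epsilon=1.0, lo=-1.0, hi=1.0, rng=rng))
```

How to choose the clamp bounds and threshold so that the test attains the optimal sample complexity is the subject of the talk and is not addressed by this sketch.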

Date

November 20, 2019, 4:00 PM to 5:00 PM (America/New_York)

Location

32-G575

Organizer & Contact

Rebecca Yadegar

ryadegar@csail.mit.edu

Part of

Algorithms and Complexity Seminar 2019-2020

Related Events

December 05

SETH-Hardness of Coding Problems

December 04

Discriminatory and Liberatory Algorithms: Restructuring Algorithmic “Fairness”