CSAIL Event Calendar: Previous Series

[A&C Seminar] Testing Properties of Collections of Distributions: Equivalence and Similar Means

Speaker: Reut Levi , Tel Aviv University
Date: September 22 2011
Time: 4:00PM to 5:00PM
Location: 32-G575
Host: Ronitt Rubinfeld, CSAIL

Contact: Eric Price, ecprice@mit.edu
Relevant URL: http://www.mit.edu/~ecprice/algcompsem/

We propose a framework for studying property testing of collections of distributions, where the number of distributions in the collection is a parameter of the problem. Previous work on property testing of distributions considered single distributions or pairs of distributions. We suggest two models that differ in the way the algorithm is given access to samples from the distributions. In one model the algorithm may ask for a sample from any distribution of its choice, and in the other the distributions from which it gets samples are selected randomly. Our main focus is on the basic problem of distinguishing between the case that all the distributions in the collection are the same (or very similar), and the case that it is necessary to modify the distributions in the collection in a non-negligible manner so as to obtain this property. We give almost tight upper and lower bounds for this testing problem, as well as study an extension to a clusterability property. One of our lower bounds directly implies a lower bound on testing independence of a joint distribution, a result which was left open by previous work.

In a more recent work, we consider another basic problem of testing a basic property of collections of distributions: having similar means. Namely, the algorithm should accept collections of distributions in which all distributions have means that do not differ by more than some given parameter, and should reject collections that are relatively far from having this property. We provide upper and lower bounds in both models. In particular, in the query model, the complexity of the problem is polynomial in $1/\epsilon$ (where $\epsilon$ is the given distance parameter). On the other hand, in the sampling model, the complexity grows roughly as $m^{1-\poly(1/\epsilon)}, where $m$ is the number of distributions.

Joint work with Prof. Dana Ron and Prof. Ronitt Rubinfeld.

See other events that are part of

See other events happening in September 2011


About Us Research News Resources Directory