Is efficient data analysis inherently non-robust?
Speaker: Moritz Hardt, IBM
Date: Wednesday, January 23 2013
Time: 4:15PM to 5:15PM
Host: Shafi Goldwasser, MIT CSAIL
Contact: Holly Jones, 617-253-6098, firstname.lastname@example.orgRelevant URL:
Abstract: Computational efficiency is the central concern in the design of algorithms for massive data sets. Yet an algorithm can be impractical not due to lack of efficiency, but rather lack of robustness to errors, outliers and adversarial conditions. Can we design algorithms that are both efficient and robust? In this talk we explore this question in two fundamental settings giving both new algorithms and impossibility results.
We first consider linear sketches; a powerful algorithmic tool widely used in data streams, compressed sensing, and distributed computing. In realistic settings, a linear sketch faces the possibility that its inputs are correlated with previous outputs of the sketch. Known results do not guarantee the correctness of the output in the presence of such correlations. We argue that this situation is inherent by showing that for a range of basic problems no linear sketch can guarantee correctness on a polynomial number of adaptively chosen inputs.
In the second part of the talk we consider the fundamental problem of discovering hidden linear structure in a data set in the presence of adversarial outliers. We give new algorithms for these problems and show that the fraction of outliers they can handle is essentially best possible assuming the Small Set Expansion Hypothesis.
*Based on joint works with David P. Woodruff and Ankur Moitra.
See other events happening in January 2013