Sublinear Algorithms for String Compressibility and the Distribution Support Size

Speaker: Sofya Raskhodnikova , Weizmann Institute
Date: March 16 2006
Time: 4:00PM to 5:15PM
Location: 32-G575
Host: Madhu Sudan, MIT CSAIL
Contact: Joanne Hanley, 617.253.6054, joanne@csail.mit.edu
Relevant URL: http://theory.csail.mit.edu/~madhu/algcomp/sofya-abs.htmlAbstract:
Imagine having to choose between a few compression schemes to compress a very long file. Before deciding on the scheme, you might want to obtain a rough estimate on how well each scheme performs on your file. We consider the question of approximating compressibility of a string with respect to a fixed compression scheme, in sublinear time.
In the talk, we will concentrate on the run-length encoding and a version of Lempel-Ziv as our example compression schemes. We present algorithms and lower bounds for approximating compressibility with respect to both schemes. We show that compressibility with respect to Lempel-Ziv is related to approximating the support size of a distribution. This problem has been considered under different guises in the literature. We prove a lower bound for it, at the heart of which is a construction of two positive integer random variables, X and Y, with very different expectations and the following condition on the moments up to k:
E[X]/E[Y] = E[X^2]/E[Y^2] = ... = E[X^k]/E[Y^k].
Joint work with Dana Ron, Ronitt Rubinfeld, Amir Shpilka and Adam Smith.
See other events that are part of Algorithms and Complexity Spring 2006
See other events happening in March 2006