Optimal Quantile Estimation: Beyond the Comparison Model

Speaker

UC Berkeley

Host

Noah Golowich
MIT
Abstract: Estimating quantiles is one of the most basic problems in data sketching. In this problem, a stream x1, x2, x3, …, xn of elements from some universe of size U, a rank r, and an accuracy ε are given. The goal is to give a space-efficient algorithm that outputs an element with rank between r-εn and r+εn. For example, this captures median and 99th percentile estimation. Unsurprisingly, a quantile sketch can be made more space-efficient than storing every element individually (which would take nlogU memory). In this talk, we’ll describe a deterministic quantile sketch using O(ε-1·(logn+logU)) bits of memory. This is optimal and improves upon the previous best deterministic algorithm (Greenwald and Khanna 2003) and even upon the best randomized algorithm (Karnin, Lang, and Libery 2016). This is joint work with Mihir Singhal and Hongxun Wu.