Optimal Quantile Estimation: Beyond the Comparison Model

Speaker

UC Berkeley

Host

Noah Golowich

MIT

Abstract: Estimating quantiles is one of the most basic problems in data sketching. In this problem, a stream x1, x2, x3, …, xn of elements from some universe of size U, a rank r, and an accuracy ε are given. The goal is to give a space-efficient algorithm that outputs an element with rank between r-εn and r+εn. For example, this captures median and 99th percentile estimation. Unsurprisingly, a quantile sketch can be made more space-efficient than storing every element individually (which would take nlogU memory). In this talk, we’ll describe a deterministic quantile sketch using O(ε-1·(logn+logU)) bits of memory. This is optimal and improves upon the previous best deterministic algorithm (Greenwald and Khanna 2003) and even upon the best randomized algorithm (Karnin, Lang, and Libery 2016). This is joint work with Mihir Singhal and Hongxun Wu.

Add to Calendar 2024-05-06 16:00:00 2024-05-06 17:00:00 America/New_York Optimal Quantile Estimation: Beyond the Comparison Model Abstract: Estimating quantiles is one of the most basic problems in data sketching. In this problem, a stream x1, x2, x3, …, xn of elements from some universe of size U, a rank r, and an accuracy ε are given. The goal is to give a space-efficient algorithm that outputs an element with rank between r-εn and r+εn. For example, this captures median and 99th percentile estimation. Unsurprisingly, a quantile sketch can be made more space-efficient than storing every element individually (which would take nlogU memory). In this talk, we’ll describe a deterministic quantile sketch using O(ε-1·(logn+logU)) bits of memory. This is optimal and improves upon the previous best deterministic algorithm (Greenwald and Khanna 2003) and even upon the best randomized algorithm (Karnin, Lang, and Libery 2016). This is joint work with Mihir Singhal and Hongxun Wu. 32-D507

Organizer & Contact

Noah Golowich

nzg@csail.mit.edu

Part of

Algorithms and Complexity (A&C) 2024 - 2025

Optimal Quantile Estimation: Beyond the Comparison Model

Speaker

Host

May 06 2024

Location

Organizer & Contact

Part of

May 14

Catalytic Computing: A Primer

May 01

Understanding the Trade-Offs Between Hallucinations and Mode Collapse in Language Generation

Optimal Quantile Estimation: Beyond the Comparison Model

Speaker

Host

May 06 2024

Location

Organizer & Contact

Part of

Related Events

May 14

Catalytic Computing: A Primer

May 01

Understanding the Trade-Offs Between Hallucinations and Mode Collapse in Language Generation