CSAIL Event Calendar: Previous Series

Near-Optimal Hashing Algorithm for the Approximate Nearest Neighbor Problem

Speaker: Piotr Indyk , Massachusetts Institute of Technology
Date: February 14 2006
Time: 4:15PM to 5:15PM
Location: 32-155
Host: Madhu Sudan, Massachusetts Institute of Technology

Contact: Kevin Matulef, 3-5883, matulef@mit.edu
Relevant URL: http://theory.lcs.mit.edu/theory-seminars/

The nearest neighbor problem is defined as follows: given a set of n data points in a d-dimensional space, construct a data structure which, given a query point q, returns the data point closest to the query.

The problem is of importance in multiple areas of computer science. Nearest neighbor algorithms can be used for: classification (in machine learning), similarity search (in biological, text or visual databases), data quantization (in signal processing and compression), etc. Most of these applications involve high-dimensional data. Unfortunately, classic algorithms for geometric search problems, such as kd-trees, do not scale well as the dimension increases.

In recent years, there has been extensive research done on *approximate* variants of the nearest neighbor problem. One of the algorithms, based on the idea of Locality-Sensitive Hashing (LSH), has been successfully used in several applied scenarios.

In this talk I will review that work, as well as describe recent developments in this area. In particular, I will show a new LSH-based algorithm, whose running time significantly improves over the earlier bounds. The running time of the new algorithm is provably close to the best possible in the class of hashing-based algorithms.

-----------------------------------
This is joint work with Alex Andoni (MIT).

See other events that are part of Theory Colloquium Spring 2006

See other events happening in February 2006


About Us Research News Resources Directory