CSAIL Event Calendar


[A&C Seminar] Learning patterns in Big data from small data using core-sets

Speaker: Dan Feldman, MIT
Date: Thursday, November 29 2012
Time: 4:00PM to 5:00PM
Location: 32-G575
Contact: Eric Price, ecprice@mit.edu
Relevant URL: http://www.mit.edu/~ecprice/algcompsem/

When we need to solve an optimization problem, we usually use the best
available algorithm or software, or try to improve it. In recent years we have
started exploring a different approach: instead of improving the algorithm,
reduce the input data and run the existing algorithm on the reduced data. This
yields the desired output much faster, on streaming input, using a manageable
amount of memory, and in parallel (say, using Hadoop, a cloud service, or GPUs).

A core-set for a given problem is a semantic compression of its input, in the
sense that a solution for the problem with the (small) core-set as input yields
a provably approximate solution to the problem with the original (Big) data.
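
As a toy illustration of this definition (not one of the constructions from the
talk), consider the 1-mean problem: find the center c that minimizes the sum of
squared distances to n points. In this special case the point count, the
coordinate sums, and the sum of squared norms form an exact summary, since
sum ||p - c||^2 = Q - 2 c.S + n ||c||^2. The sketch below assumes plain Python
and points given as tuples of floats; the names MeanSummary and full_cost are
hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Sequence

Point = Sequence[float]


@dataclass
class MeanSummary:
    """Hypothetical exact summary for the 1-mean problem (illustrative only)."""
    n: int          # number of points summarized
    s: List[float]  # coordinate-wise sum of the points
    q: float        # sum of squared norms of the points

    @classmethod
    def from_points(cls, points: Iterable[Point], dim: int) -> "MeanSummary":
        n, s, q = 0, [0.0] * dim, 0.0
        for p in points:            # one pass, so it also works on a stream
            n += 1
            for i, x in enumerate(p):
                s[i] += x
                q += x * x
        return cls(n, s, q)

    def merge(self, other: "MeanSummary") -> "MeanSummary":
        # Summaries of disjoint chunks combine by addition, which is what
        # makes streaming and parallel computation possible.
        return MeanSummary(self.n + other.n,
                           [a + b for a, b in zip(self.s, other.s)],
                           self.q + other.q)

    def cost(self, c: Point) -> float:
        # sum ||p - c||^2 = q - 2 * (c . s) + n * ||c||^2
        dot = sum(ci * si for ci, si in zip(c, self.s))
        return self.q - 2.0 * dot + self.n * sum(ci * ci for ci in c)

    def best_center(self) -> List[float]:
        # The minimizer of the 1-mean cost is the mean of the points.
        return [si / self.n for si in self.s]


def full_cost(points: Iterable[Point], c: Point) -> float:
    """Reference cost computed directly on the original data."""
    return sum(sum((x - ci) ** 2 for x, ci in zip(p, c)) for p in points)
```

Summarizing two chunks separately and merging them gives the same cost, for any
center, as a pass over the full data. Core-sets for harder problems usually
guarantee this only up to a small approximation error.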

In this talk I will describe how we applied this magical paradigm to obtain
algorithmic achievements with performance guarantees in iDiary: a system that
combines robotics, sensor networks, computer vision, differential privacy, and
text mining. It turns large signals collected from smartphones or robot
sensors into textual descriptions of the trajectories. The system features a
user interface similar to Google Search that allows users to type text queries
on their activities (e.g., "Where did I buy books?") and receive textual
answers based on their signals.


Bio:
Dan Feldman is a postdoc at MIT in the Distributed Robotics Lab, where he develops systems for handling
streaming Big data from sensors, smartphones, images, and robots. He received his Ph.D. from Tel-Aviv University
in 2010, under the supervision of Prof. Micha Sharir and Prof. Amos Fiat. He was then a postdoc at the Center for
the Mathematics of Information at Caltech for a year and a half, where he started to reduce the gap between
theoretical computational geometry and practical machine learning. He specializes in developing software for
scalable data compression, based on core-set constructions with provable guarantees. In recent years his
core-sets have been implemented in several start-ups, banks, supermarkets, and internet search companies, to
name just a few. When he is not working, Dan is building robots with his very own core-sets, Ariel and Eleanor.
