Set Retrieval 2.0
Speaker: Daniel Tunkelang, Endeca
Date: Tuesday, November 4, 2008
Time: 11:00AM to 12:00PM
Refreshments: 10:45AM
Location: Patil Conference Room (32-G449)
Host: Rob Miller, MIT CSAIL
Contact: Rob Miller, x4-6028, rcm@mit.edu
Abstract:
The earliest information retrieval systems were set retrieval systems,
also known as Boolean retrieval systems because they expected users to
enter queries as Boolean expressions. While set retrieval still survives
in professional search applications, it has been largely supplanted by
best-match or ranked retrieval familiar to anyone who has used web
search.
Best-match retrieval offers several advantages, the most salient being
that it does not require users to be professionally trained. But one of
its significant disadvantages is a loss of transparency. Users make a
leap of faith that the ranking algorithm works, and then resign
themselves to trying again when they are not satisfied with their search
results.
What we need is a retrieval approach that combines the best of both
worlds, providing transparency but not requiring professional training.
We find such an approach in the emerging field of human-computer
information retrieval (HCIR), which conceives information seeking as a
dialogue between the user and the system.
This presentation will outline the principles of information seeking as
a dialogue and walk through concrete examples that illustrate the
principles of HCIR. The foundation is an interactive set retrieval
approach that responds to queries with an overview of the user's current
context and an organized set of options for incremental exploration.
Contextual summaries of document sets optimize the system's
communication with the user, while query refinement options optimize
the user's communication with the system.
By enabling bidirectional communication between the user and the system,
we can address the inherent limitations of best-match approaches.
Speaker Bio:
Daniel Tunkelang is co-founder and Chief Scientist of Endeca, a provider
of enterprise information access solutions. He leads Endeca's efforts to
develop features and capabilities that emphasize user interaction and is
a leading industry advocate of dialog-oriented approaches to information
retrieval. He publishes The Noisy Channel (http://thenoisychannel.com/),
a blog about HCIR and related issues.
Yahoo/EECS HCI-IR Seminar Series, sponsored by Yahoo!