Learning Representation and Behavior: A Unified Approach to Credit Assignment
Speaker: Sridhar Mahadevan, University of Massachusetts at Amherst
Date: October 18, 2006
Time: 4:00PM to 5:00PM
Location: 32-G475A Stata Center - Gates 4th Floor Lounge
Host: Regina Barzilay, MIT CSAIL
Contact: Marcia Davidson, 617-253-3049, email@example.com
From the earliest period of research in AI, the credit assignment problem has been recognized as a fundamental challenge. Beginning with Samuel's pioneering work in the 1950s, past work has largely taken a divide-and-conquer approach, gluing together separate solutions to the temporal credit assignment problem (e.g., TD learning) and the spatial credit assignment problem (e.g., tuning RBF networks). This talk describes recent work on a unified approach to the credit assignment problem in which agents jointly learn representations and behavior by solving a "heat" diffusion equation on the state (action) space manifold that looks remarkably similar to the traditional reward-based Bellman equation. The resulting learned representations vividly capture the large-scale irregular geometry of an environment, and can be viewed as "proto-value" functions, or task-independent building blocks of reward-based value functions.
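One common way to see the resemblance mentioned above is sketched here. It is an illustration under stated assumptions, not the formulation from the talk: the symbols $P^\pi$, $R^\pi$, $\gamma$, $W$, $D$, and $P_r$ do not appear in the abstract and are introduced only for this sketch. For a fixed policy, the Bellman equation is a linear system in the operator $I - \gamma P^\pi$, while the random-walk graph Laplacian on the state space has the form $I - P_r$, with no reward term.

```latex
% Bellman equation for a fixed policy \pi (standard form)
V^\pi = R^\pi + \gamma P^\pi V^\pi
\quad\Longleftrightarrow\quad
(I - \gamma P^\pi)\, V^\pi = R^\pi

% Random-walk Laplacian on a state graph with weight matrix W
% and degree matrix D (assumed construction, for illustration)
P_r = D^{-1} W, \qquad \mathcal{L} = I - P_r

% Reward-independent basis functions as eigenfunctions of \mathcal{L}
\mathcal{L}\, \phi_i = \lambda_i\, \phi_i
```

Both equations are built from "identity minus a transition operator"; the second depends only on the connectivity of the state space, which is what makes its eigenfunctions task-independent.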
Two approaches to constructing proto-value functions will be compared. Fourier or Laplacian methods construct global, long time-scale eigenvector bases by diagonalizing a random walk on the state (action) space manifold. The diffusion wavelet framework, in contrast, constructs local compact bases using dilations of random walk operators to model multi-scale spatial and temporal regularities. Novel algorithms for solving Markov decision processes will be outlined, including representation policy iteration, which combines the learning of basis functions with control, and diffusion policy evaluation, which yields a compact way of representing powers of transition matrices. Ideas for scaling the proposed framework to large factored and continuous domains will be presented.
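The Fourier (Laplacian) construction can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the 20-state chain, the helper names `chain_adjacency` and `proto_value_functions`, and the use of the combinatorial Laplacian L = D - W in place of the random-walk operator named in the abstract. It is not the speaker's implementation.

```python
import numpy as np

def chain_adjacency(n):
    """Adjacency matrix of an n-state chain (hypothetical toy state space)."""
    W = np.zeros((n, n))
    idx = np.arange(n - 1)
    W[idx, idx + 1] = 1.0  # edge from state i to state i + 1
    W[idx + 1, idx] = 1.0  # symmetric edge back
    return W

def proto_value_functions(W, k):
    """Return the k smoothest eigenpairs of the combinatorial Laplacian L = D - W.

    Assumption: the combinatorial Laplacian stands in for the random-walk
    operator; both depend only on the graph, not on any reward.
    """
    D = np.diag(W.sum(axis=1))
    L = D - W
    # eigh returns eigenvalues in ascending order with orthonormal eigenvectors
    eigvals, eigvecs = np.linalg.eigh(L)
    return eigvals[:k], eigvecs[:, :k]

n_states, n_basis = 20, 5
W = chain_adjacency(n_states)
eigvals, basis = proto_value_functions(W, n_basis)

# Approximate an arbitrary target function over states (a stand-in for a
# value function) by least-squares projection onto the learned basis.
target = np.linspace(0.0, 1.0, n_states) ** 2
weights, *_ = np.linalg.lstsq(basis, target, rcond=None)
approx = basis @ weights
```

The basis is computed from the graph alone, so the same columns of `basis` could be reused to fit value functions for different reward functions on the same state space; only `weights` would change.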
Part of the Language, Learning, Vision and Graphics Seminar Series (LLVG) 2006/2007