Doctoral Thesis Defense: On the Learnability of General Reinforcement-Learning Objectives

Speaker

Cambridge Yang
CSAIL

Thesis Supervisor: Michael J. Carbin

Abstract:
Reinforcement learning enables agents to learn optimal behaviors in unknown environments to achieve specified objectives. Traditionally, these objectives are expressed as rewards, with established algorithms guaranteeing the learning of policies that maximize them. However, reward-based objectives often act as imperfect surrogates for true objectives, leading to problems such as reward hacking, where agents exploit the reward signal without achieving the intended goal. This limitation underscores the need for more expressive frameworks to specify and learn general reinforcement-learning objectives.

This thesis addresses the specification and learnability of general reinforcement-learning objectives, tackling key challenges across three progressively complex classes: Linear Temporal Logic (LTL) objectives, computable objectives, and non-computable objectives. For LTL objectives, I show that their learnability depends critically on whether they are finitary. For computable objectives, I establish sufficient conditions for PAC-learnability based on continuity and computability, providing a systematic approach to assess and design learnable objectives. Finally, I introduce a framework for specifying and analyzing the learnability of non-computable objectives, proposing a relaxed notion of learnability that extends guarantees to this broader class.
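To give a flavor of the finitary distinction, here is a minimal illustrative sketch. The propositions and the bounded-eventually notation are hypothetical, and the informal reading of "finitary" as "settled by a bounded prefix of every trajectory" is an assumption for illustration, not the thesis's formal definition:

```
% Illustrative sketch only; propositions (goal) and notation are hypothetical,
% and the reading of "finitary" below is an informal assumption.
\documentclass{article}
\usepackage{amsmath, amssymb}
\begin{document}

% Settled by the first 10 steps of any trajectory (informally "finitary"):
\[
  \varphi_{1} \;=\; \lozenge_{\le 10}\, \mathit{goal}
  \quad\text{(``reach \textit{goal} within 10 steps'')}
\]

% Never settled by any finite prefix: every finite prefix can be extended to
% both a satisfying and a violating infinite trajectory (informally non-finitary):
\[
  \varphi_{2} \;=\; \square \lozenge\, \mathit{goal}
  \quad\text{(``visit \textit{goal} infinitely often'')}
\]

\end{document}
```

Intuitively, under that reading an agent can check the first objective from finite rollouts, whereas no finite amount of experience can confirm or refute the second, which is why learnability guarantees hinge on this kind of distinction.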

By unifying the study of general reinforcement-learning objectives, this work offers a comprehensive framework for specifying and analyzing objectives, and for learning policies, beyond the reward-based paradigm. This thesis lays a theoretical foundation for designing intelligent agents capable of achieving complex, well-specified goals.