Private Event

Doctoral Thesis Defense: Learning and Control for Interactions in Mixed Human-Robot Environments

Speaker

Wilko Schwarting
CSAIL MIT

Host

Daniela Rus
CSAIL MIT

Abstract:
Society will receive tremendous benefits from the responsible use of robot capabilities that enable and support people. Robots will be part of our daily lives and will have to achieve complex interactive behaviors to seamlessly work with and around us. These intelligent agents will learn how to reason about human behavior and people's intentions. Furthermore, they will not only predict others' intentions, but also implicitly communicate their own intentions through human-like actions that can be understood by people. They will anticipate and leverage the effect of their actions on the actions of others in the environment. When their own interests and those of others are not aligned, robots will be able to quantify people's willingness to cooperate or defect and negotiate accordingly through social behavior.
Robots will form beliefs by perceiving the world and others, shaped by uncertainty in perception, prediction, and the environment in general. They will plan to actively gather information about themselves, others, and the environment, while avoiding actions that lead to high uncertainty in scenarios where that is undesirable. Robots will reason not only about their own beliefs about the world but also about the beliefs of others; thus, they can leverage how their actions influence others' beliefs.

Towards this future, this thesis contributes algorithms that enable (i) Social Human-Robot Interactions, agents that leverage (ii) Learning from Competition, and (iii) a Guardian system to keep people safe. For (i) Social Human-Robot Interactions, we learn human reward functions from driving data and formulate interactions between agents as a best-response game in which each agent negotiates to maximize its utility. We measure Social Value Orientation (SVO) to quantify an agent's degree of selfishness or altruism, allowing us to better predict driver behavior. In stochastic environments with partial observations, we additionally enable agents to leverage information gain and to reason about the beliefs of others by combining game-theoretic and belief-space planning. For (ii) Learning from Competition, we present a multi-agent reinforcement learning algorithm that learns competitive visual control policies through self-play in imagination: the agent imagines multi-agent interaction sequences in the compact latent space of a learned world model that combines a joint transition function with opponent-viewpoint prediction. Lastly, for (iii), we introduce Parallel Autonomy, a Guardian system that provides safety in complex driving environments while following people's desired actions as closely as safely possible.
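In the SVO literature this line of work draws on, an agent's preference is commonly expressed as an angle that trades off its own reward against another agent's reward. The following minimal sketch illustrates that angular formulation only; the function and variable names are illustrative, not taken from the thesis:

```python
import math

def svo_utility(reward_self, reward_other, phi_deg):
    """Blend an agent's own reward with another agent's reward
    according to a Social Value Orientation (SVO) angle.

    phi_deg = 0   -> purely egoistic (only own reward counts)
    phi_deg = 45  -> prosocial (both rewards weighted equally)
    phi_deg = 90  -> purely altruistic (only the other's reward counts)
    """
    phi = math.radians(phi_deg)
    return math.cos(phi) * reward_self + math.sin(phi) * reward_other

# An egoistic agent ignores the other's reward entirely:
egoistic = svo_utility(1.0, 0.5, 0)    # -> 1.0
# A prosocial agent weights both rewards equally:
prosocial = svo_utility(1.0, 0.5, 45)
```

Under this formulation, estimating the angle from observed driving behavior is what lets a planner predict whether a driver is likely to yield or to defect.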

Thesis Committee:
Prof. Daniela Rus, Supervisor
Prof. Leslie Kaelbling
Prof. Sertac Karaman