Single-shot policy explanation to improve task performance via semantic reward coaching Journal Article

Overview

abstract

  • Communication is crucial for synchronizing expectations and knowledge within teams. For robots to effectively collaborate with or provide actionable decision-support or coaching to humans, it is critical that they be able to generate intelligible explanations to reconcile differences between their understanding of the world and that of their collaborators. In this work we present Single-shot Policy Elicitation for Augmenting Rewards (SPEAR), a novel sequential optimization algorithm that uses semantic explanations derived from combinations of planning predicates to augment human agents’ reward functions, driving their policies to exhibit more optimal behavior by modeling humans as reinforcement learning (RL) agents and reconciling disparities in their reward function. We present an experimental validation of the policy manipulation capabilities of SPEAR in a practically grounded application and a performance analysis of SPEAR across a suite of domains with increasingly complex state spaces and predicate counts. SPEAR demonstrates substantial improvements in runtime and addressable problem size, enabling an expert agent to leverage its own expertise to communicate actionable information to improve human performance. Through a series of human subjects studies, we demonstrate SPEAR’s potential to improve human policies and reduce cognitive load, all while enhancing interpretability, task awareness, and promoting active thinking patterns among users. Finally, we apply SPEAR in a robot-to-robot policy manipulation scenario, showcasing its applicability in robot-robot collaborations.

publication date

  • September 1, 2025

Date in CU Experts

  • January 12, 2026 1:08 AM

Full Author List

  • Tabrez A; Leonard R; Hayes B

author count

  • 3

Other Profiles

International Standard Serial Number (ISSN)

  • 0941-0643

Electronic International Standard Serial Number (EISSN)

  • 1433-3058

Additional Document Info

start page

  • 22315

end page

  • 22337

volume

  • 37

issue

  • 26