Paper Title
Goal Recognition as Reinforcement Learning
Paper Authors
Paper Abstract
Most approaches for goal recognition rely on specifications of the possible dynamics of the actor in the environment when pursuing a goal. These specifications suffer from two key issues. First, encoding these dynamics requires careful design by a domain expert, which is often not robust to noise at recognition time. Second, existing approaches often need costly real-time computations to reason about the likelihood of each potential goal. In this paper, we develop a framework that combines model-free reinforcement learning and goal recognition to alleviate the need for careful, manual domain design and for costly online executions. This framework consists of two main stages: offline learning of policies or utility functions for each potential goal, and online inference. We provide a first instance of this framework using tabular Q-learning for the learning stage, along with three measures that can be used to perform the inference stage. The resulting instantiation achieves state-of-the-art performance against existing goal recognizers on standard evaluation domains, and superior performance in noisy environments.
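To make the two-stage framework concrete, below is a minimal Python sketch. It assumes a discrete environment with a hypothetical interface (env.reset(goal), env.step(action) returning (next_state, reward, done), and a discrete action list env.actions); none of these names come from the paper. The offline stage runs tabular Q-learning once per candidate goal; the online stage ranks goals with one simple utility-style score (summed Q-values of the observed state-action pairs), which is a stand-in for, not a reproduction of, the paper's three inference measures.

```python
import random
from collections import defaultdict

def q_learn(env, goal, episodes=500, alpha=0.1, gamma=0.95, eps=0.1):
    """Offline stage: learn a tabular Q-function for one candidate goal.

    The environment interface (reset(goal), step(action) -> (next_state,
    reward, done), and env.actions) is a hypothetical assumption here.
    """
    Q = defaultdict(float)  # maps (state, action) -> estimated value
    for _ in range(episodes):
        state, done = env.reset(goal), False
        while not done:
            # epsilon-greedy exploration over the discrete action set
            if random.random() < eps:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)
            # one-step Q-learning update (no bootstrap at terminal states)
            best_next = 0.0 if done else max(Q[(next_state, a)] for a in env.actions)
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state
    return Q

def rank_goals(q_tables, observations):
    """Online stage: score each candidate goal by how consistent the
    observed (state, action) pairs are with its learned Q-values.

    Summing Q-values of observed pairs is one simple utility-style
    measure, used here only as an illustrative stand-in.
    """
    scores = {goal: sum(Q[(s, a)] for (s, a) in observations)
              for goal, Q in q_tables.items()}
    return max(scores, key=scores.get), scores
```

Under these assumptions, usage would mirror the two stages: train offline with q_tables = {g: q_learn(env, g) for g in candidate_goals}, then call rank_goals(q_tables, observed_pairs) at recognition time. Because all Q-tables are learned in advance, the online step reduces to cheap table lookups, which is the cost profile the abstract contrasts with expensive real-time reasoning.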