Overcoming Referential Ambiguity in Language-Guided Goal-Conditioned Reinforcement Learning

Paper Title

Overcoming Referential Ambiguity in Language-Guided Goal-Conditioned Reinforcement Learning

Authors

Hugo Caselles-Dupré, Olivier Sigaud, Mohamed Chetouani

Abstract

Teaching an agent to perform new tasks using natural language can easily be hindered by ambiguities in interpretation. When a teacher provides an instruction to a learner about an object by referring to its features, the learner can misunderstand the teacher's intentions, for instance if the instruction refers ambiguously to features of the object, a phenomenon called referential ambiguity. We study how two concepts derived from cognitive sciences can help resolve these referential ambiguities: pedagogy (selecting the right instructions) and pragmatism (learning the preferences of the other agents using inductive reasoning). We apply these ideas to a teacher/learner setup with two artificial agents on a simulated robotic task (block-stacking). We show that these concepts improve sample efficiency for training the learner.
