Grounding Language to Autonomously-Acquired Skills via Goal Generation

Paper title

Grounding Language to Autonomously-Acquired Skills via Goal Generation

Authors

Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud

Abstract

We are interested in the autonomous acquisition of repertoires of skills. Language-conditioned reinforcement learning (LC-RL) approaches are great tools in this quest, as they allow abstract goals to be expressed as sets of constraints on the states. However, most LC-RL agents are not autonomous and cannot learn without external instructions and feedback. Besides, their direct language condition cannot account for the goal-directed behavior of pre-verbal infants and strongly limits the expression of behavioral diversity for a given language input. To resolve these issues, we propose a new conceptual approach to language-conditioned RL: the Language-Goal-Behavior architecture (LGB). LGB decouples skill learning and language grounding via an intermediate semantic representation of the world. To showcase the properties of LGB, we present a specific implementation called DECSTR. DECSTR is an intrinsically motivated learning agent endowed with an innate semantic representation describing spatial relations between physical objects. In a first stage (G -> B), it freely explores its environment and targets self-generated semantic configurations. In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs. We showcase the additional properties of LGB w.r.t. both an end-to-end LC-RL approach and a similar approach leveraging non-semantic, continuous intermediate representations. Intermediate semantic representations help satisfy language commands in a diversity of ways, enable strategy switching after a failure, and facilitate language grounding.
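
The decoupling described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. Everything in it is an assumption made for illustration: the three named objects, the single binary "close" predicate per object pair, the `(pair, value)` form of an instruction, and the names `matching_goals` and `follow_instruction`. In particular, the trained goal generator of the L -> G stage is replaced here by exhaustive enumeration, and the G -> B policy is a stand-in callable that reports success or failure.

```python
import itertools
import random

# Hypothetical setup: three objects, one binary "close" predicate per unordered pair.
OBJECTS = ("red", "green", "blue")
PAIRS = list(itertools.combinations(OBJECTS, 2))


def all_configurations():
    """Enumerate every semantic configuration as a tuple of 0/1 predicates."""
    return list(itertools.product((0, 1), repeat=len(PAIRS)))


def matching_goals(instruction):
    """L -> G stand-in: every configuration that satisfies the constraint.

    `instruction` is a hypothetical (pair, value) tuple, e.g.
    (("red", "green"), 1) for "put red close to green".
    """
    pair, value = instruction
    idx = PAIRS.index(pair)
    return [c for c in all_configurations() if c[idx] == value]


def follow_instruction(instruction, policy, rng):
    """Try matching goals in random order and switch goals after a failure.

    `policy` stands in for the G -> B skill: it takes a semantic goal and
    returns True if the goal was reached. Returns (goal, attempts), with
    goal set to None when every candidate failed.
    """
    candidates = matching_goals(instruction)
    rng.shuffle(candidates)
    for attempts, goal in enumerate(candidates, start=1):
        if policy(goal):
            return goal, attempts
    return None, len(candidates)


if __name__ == "__main__":
    rng = random.Random(0)
    instruction = (("red", "green"), 1)
    # Stand-in policy that can only reach one particular configuration.
    only_reachable = (1, 0, 0)
    goal, attempts = follow_instruction(instruction, lambda g: g == only_reachable, rng)
    print("reached", goal, "after", attempts, "attempt(s)")
```

Because one instruction maps to several semantic goals, the same command can be satisfied in different ways, and a failed attempt leads to a different goal instead of a retry of the same one.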
