Paper Title

Grounding Language to Autonomously-Acquired Skills via Goal Generation

Authors

Ahmed Akakzia, Cédric Colas, Pierre-Yves Oudeyer, Mohamed Chetouani, Olivier Sigaud

Abstract

We are interested in the autonomous acquisition of repertoires of skills. Language-conditioned reinforcement learning (LC-RL) approaches are well suited to this quest, as they allow abstract goals to be expressed as sets of constraints on states. However, most LC-RL agents are not autonomous and cannot learn without external instructions and feedback. Moreover, conditioning behavior directly on language cannot account for the goal-directed behavior of pre-verbal infants and strongly limits the behavioral diversity that can be expressed for a given language input. To resolve these issues, we propose a new conceptual approach to language-conditioned RL: the Language-Goal-Behavior architecture (LGB). LGB decouples skill learning from language grounding via an intermediate semantic representation of the world. To showcase the properties of LGB, we present a specific implementation called DECSTR. DECSTR is an intrinsically motivated learning agent endowed with an innate semantic representation describing spatial relations between physical objects. In a first stage (G -> B), it freely explores its environment and targets self-generated semantic configurations. In a second stage (L -> G), it trains a language-conditioned goal generator to generate semantic goals that match the constraints expressed in language-based inputs. We showcase the additional properties of LGB with respect to both an end-to-end LC-RL approach and a similar approach leveraging non-semantic, continuous intermediate representations. Intermediate semantic representations help satisfy language commands in a diversity of ways, enable strategy switching after a failure, and facilitate language grounding.
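To make the idea of an intermediate semantic representation more concrete, here is a minimal sketch in Python. It is not the DECSTR implementation. The object names, the two predicates (`close`, `above`), the distance thresholds, and both function names are assumptions introduced only for illustration. The exhaustive enumeration in `compatible_goals` stands in for the learned language-conditioned goal generator mentioned in the abstract, and it does not filter out physically impossible configurations.

```python
import math
from itertools import combinations, permutations, product

# Hypothetical setup: three objects and two illustrative spatial predicates.
OBJECTS = ("red", "green", "blue")
PREDICATES = (
    [("close", a, b) for a, b in combinations(OBJECTS, 2)]    # unordered pairs
    + [("above", a, b) for a, b in permutations(OBJECTS, 2)]  # ordered pairs
)


def semantic_configuration(positions, close_dist=0.1, stack_tol=0.03):
    """Map continuous (x, y, z) positions to a binary vector of predicates.

    The thresholds are arbitrary values chosen for this sketch.
    """
    bits = []
    for kind, a, b in PREDICATES:
        (xa, ya, za), (xb, yb, zb) = positions[a], positions[b]
        if kind == "close":
            bits.append(int(math.dist(positions[a], positions[b]) < close_dist))
        else:  # "above": horizontally aligned with b and strictly higher
            horizontal = math.hypot(xa - xb, ya - yb)
            bits.append(int(horizontal < stack_tol and za > zb))
    return tuple(bits)


def compatible_goals(constraints):
    """Enumerate every binary configuration satisfying {predicate: value}.

    A sentence is assumed to constrain only some predicates, so many
    distinct semantic goals can satisfy the same instruction.
    """
    goals = []
    for bits in product((0, 1), repeat=len(PREDICATES)):
        config = dict(zip(PREDICATES, bits))
        if all(config[p] == v for p, v in constraints.items()):
            goals.append(bits)
    return goals


positions = {
    "red": (0.0, 0.0, 0.05),
    "green": (0.0, 0.0, 0.0),
    "blue": (1.0, 1.0, 0.0),
}
current = semantic_configuration(positions)

# "Put red close to green" constrains a single predicate in this sketch.
goals = compatible_goals({("close", "red", "green"): 1})
```

Because an instruction leaves most predicates unconstrained, the same sentence maps to many candidate goals. Under these assumptions, this is one way to picture how an agent could satisfy a command in diverse ways or switch to another goal after a failure.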
