Paper Title

The MAGICAL Benchmark for Robust Imitation

Paper Authors

Sam Toyer, Rohin Shah, Andrew Critch, Stuart Russell

Paper Abstract

Imitation Learning (IL) algorithms are typically evaluated in the same environment that was used to create demonstrations. This rewards precise reproduction of demonstrations in one particular environment, but provides little information about how robustly an algorithm can generalise the demonstrator's intent to substantially different deployment settings. This paper presents the MAGICAL benchmark suite, which permits systematic evaluation of generalisation by quantifying robustness to different kinds of distribution shift that an IL algorithm is likely to encounter in practice. Using the MAGICAL suite, we confirm that existing IL algorithms overfit significantly to the context in which demonstrations are provided. We also show that standard methods for reducing overfitting are effective at creating narrow perceptual invariances, but are not sufficient to enable transfer to contexts that require substantially different behaviour, which suggests that new approaches will be needed in order to robustly generalise demonstrator intent. Code and data for the MAGICAL suite are available at https://github.com/qxcv/magical/.
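
The benchmark is distributed as a Python package. Below is a minimal sketch of how one might load a MAGICAL task and run an evaluation rollout on both the demonstration variant and a held-out test variant. The `register_envs()` helper and the `-Demo-v0` environment-ID convention follow the repository README; the specific task name and test-variant suffix used here are assumptions, so check them against https://github.com/qxcv/magical/ before relying on this code.

```python
# Sketch of an evaluation loop over a MAGICAL demonstration variant and a
# held-out test variant. Environment IDs and the scoring convention are
# assumptions based on the repository README; verify against the repo.
import gym
import magical

# Make the MAGICAL environments visible to gym.make().
magical.register_envs()

# "Demo" is the variant in which demonstrations are collected; the test
# variant (name assumed here) applies distribution shift such as jittered
# object positions or altered colours and shapes.
for env_id in ["MoveToCorner-Demo-v0", "MoveToCorner-TestAll-v0"]:
    env = gym.make(env_id)
    obs = env.reset()
    done, total_reward = False, 0.0
    while not done:
        # A trained IL policy would select the action here; a random
        # policy is used purely to illustrate the rollout loop.
        obs, reward, done, info = env.step(env.action_space.sample())
        total_reward += reward
    print(f"{env_id}: score = {total_reward:.2f}")
    env.close()
```

Comparing the score on the Demo variant against the test variants is the intended measure of how well an IL algorithm has generalised the demonstrator's intent rather than memorised one context.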
