From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning

Paper Title

From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning

Authors

Rafiee, Banafsheh; Abbas, Zaheer; Ghiassian, Sina; Kumaraswamy, Raksha; Sutton, Richard; Ludvig, Elliot; White, Adam

Abstract

We present three new diagnostic prediction problems inspired by classical-conditioning experiments to facilitate research in online prediction learning. Experiments in classical conditioning show that animals such as rabbits, pigeons, and dogs can make long temporal associations that enable multi-step prediction. To replicate this remarkable ability, an agent must construct an internal state representation that summarizes its interaction history. Recurrent neural networks can automatically construct state and learn temporal associations. However, the current training methods are prohibitively expensive for online prediction -- continual learning on every time step -- which is the focus of this paper. Our proposed problems test the learning capabilities that animals readily exhibit and highlight the limitations of the current recurrent learning methods. While the proposed problems are nontrivial, they are still amenable to extensive testing and analysis in the small-compute regime, thereby enabling researchers to study issues in isolation, ultimately accelerating progress towards scalable online representation learning methods.
