Paper Title

Reinforcement Learning Generalization with Surprise Minimization

Paper Authors

Chen, Jerry Zikun

Abstract

Generalization remains a challenging problem for deep reinforcement learning algorithms, which are often trained and tested on the same set of deterministic game environments. When test environments are unseen and perturbed but the nature of the task remains the same, generalization gaps can arise. In this work, we propose and evaluate a surprise-minimizing agent on a generalization benchmark to show that an additional reward learned from a simple density model can confer robustness in procedurally generated game environments that provide a constant source of entropy and stochasticity.
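The abstract describes an intrinsic reward derived from a simple density model over visited states: states that are likely under the model of past experience receive higher reward, so the agent is driven toward familiar, low-surprise states. The sketch below illustrates this idea with a diagonal-Gaussian density model fitted online; the class name, the Gaussian choice, and the Welford update are illustrative assumptions, not the paper's exact model.

```python
import numpy as np

class SurpriseMinimizingReward:
    """Illustrative surprise-minimization bonus (assumption: a diagonal
    Gaussian density model fitted online over observed states).
    Reward = log-density of the current state under the model, so
    familiar states score higher than surprising ones."""

    def __init__(self, obs_dim, eps=1e-6):
        self.count = 0
        self.mean = np.zeros(obs_dim)
        self.m2 = np.zeros(obs_dim)  # running sum of squared deviations
        self.eps = eps               # keeps the variance strictly positive

    def update(self, obs):
        # Welford's online update of the per-dimension mean and variance.
        self.count += 1
        delta = obs - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (obs - self.mean)

    def reward(self, obs):
        # Log-density of obs under the fitted diagonal Gaussian.
        var = self.m2 / max(self.count - 1, 1) + self.eps
        return float(-0.5 * np.sum(np.log(2 * np.pi * var)
                                   + (obs - self.mean) ** 2 / var))
```

In use, this bonus would be added to (or substituted for) the environment reward at each step: states close to the running mean of past observations yield a higher log-density than outliers, which is the sense in which the agent minimizes surprise.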
