Paper Title
Reinforcement Learning through Active Inference
Paper Authors
Paper Abstract
The central tenet of reinforcement learning (RL) is that agents seek to maximize the sum of cumulative rewards. In contrast, active inference, an emerging framework within cognitive and computational neuroscience, proposes that agents act to maximize the evidence for a biased generative model. Here, we illustrate how ideas from active inference can augment traditional RL approaches by (i) furnishing an inherent balance of exploration and exploitation, and (ii) providing a more flexible conceptualization of reward. Inspired by active inference, we develop and implement a novel objective for decision making, which we term the free energy of the expected future. We demonstrate that the resulting algorithm successfully balances exploration and exploitation, simultaneously achieving robust performance on several challenging RL benchmarks with sparse, well-shaped, and no rewards.
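The abstract describes an objective that scores candidate policies by combining an extrinsic term (how well predicted outcomes match a biased, preference-encoding generative model) with an epistemic term (expected information gain), so that exploration and exploitation fall out of a single quantity. The toy sketch below illustrates that decomposition in a discrete setting; it is a conceptual illustration only, not the paper's implementation, and all distributions, names, and numbers are invented for the example:

```python
import numpy as np

def negative_efe_score(pred_obs, preferred_obs, prior_states, posterior_states):
    """Score a policy by (negative) expected free energy, conceptually:
    extrinsic value (expected log-probability of preferred observations)
    plus epistemic value (KL between post- and pre-observation state
    beliefs, i.e. expected information gain). Higher is better.
    This is an illustrative toy, not the paper's exact objective."""
    eps = 1e-12  # guard against log(0)
    # Extrinsic value: predicted outcomes weighted by preference log-probs.
    extrinsic = np.sum(pred_obs * np.log(preferred_obs + eps))
    # Epistemic value: how much the policy is expected to update beliefs.
    epistemic = np.sum(posterior_states *
                       np.log((posterior_states + eps) / (prior_states + eps)))
    return extrinsic + epistemic

# A 3-outcome toy world with a "biased" prior favoring outcome 0.
preferred = np.array([0.8, 0.1, 0.1])
prior = np.array([1 / 3, 1 / 3, 1 / 3])

# Policy A: likely reaches the preferred outcome, learns little.
score_a = negative_efe_score(np.array([0.7, 0.2, 0.1]), preferred,
                             prior, np.array([0.4, 0.3, 0.3]))
# Policy B: indifferent about the outcome, but highly informative.
score_b = negative_efe_score(np.array([1 / 3, 1 / 3, 1 / 3]), preferred,
                             prior, np.array([0.9, 0.05, 0.05]))

# The agent selects the policy with the higher score, so reward-seeking
# and information-seeking trade off inside one objective.
best = "A" if score_a > score_b else "B"
```

When beliefs would not change (posterior equals prior), the epistemic term vanishes and the score reduces to pure preference-seeking; conversely, with flat preferences the score is driven entirely by information gain. This is the sense in which the balance of exploration and exploitation is "inherent" rather than added via a separate bonus.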