Paper Title
Bayesian Inverse Reinforcement Learning for Collective Animal Movement
Paper Authors
Paper Abstract
Agent-based methods allow for defining simple rules that generate complex group behaviors. The governing rules of such models are typically set a priori, and parameters are tuned from observed behavior trajectories. Instead of making simplifying assumptions across all anticipated scenarios, inverse reinforcement learning provides inference on the short-term (local) rules governing long-term behavior policies by using properties of a Markov decision process. We use the computationally efficient linearly-solvable Markov decision process to learn the local rules governing collective movement for a simulation of the self-propelled particle (SPP) model and a data application for a captive guppy population. The estimation of the behavioral decision costs is done in a Bayesian framework with basis function smoothing. We recover the true costs in the SPP simulation and find that the guppies value collective movement more than targeted movement toward shelter.
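To make the "linearly-solvable Markov decision process" ingredient concrete, the sketch below solves a toy first-exit LMDP in the style of Todorov's formulation: the desirability function z = exp(-v) satisfies the linear fixed point z = exp(-q) ⊙ (P z) under the passive dynamics P, and the optimal controlled transition is u*(y|x) ∝ p(y|x) z(y). The chain world, costs, and dynamics here are purely illustrative assumptions, not the paper's model or data.

```python
import numpy as np

# Toy problem (assumed for illustration): a 5-state chain; state 4 is an
# absorbing goal. Passive dynamics P are an uncontrolled random walk.
n = 5
P = np.zeros((n, n))
for s in range(n - 1):
    P[s, max(s - 1, 0)] += 0.5
    P[s, min(s + 1, n - 1)] += 0.5
P[n - 1, n - 1] = 1.0  # absorbing goal state

q = np.ones(n)   # per-step state costs (uniform here, by assumption)
q[n - 1] = 0.0   # no cost incurred at the goal

# Desirability z = exp(-v) solves the linear equation z = exp(-q) * (P @ z)
# for the non-goal states; iterate to the fixed point.
z = np.ones(n)
for _ in range(1000):
    z_new = np.exp(-q) * (P @ z)
    z_new[n - 1] = 1.0  # boundary condition at the absorbing goal
    if np.max(np.abs(z_new - z)) < 1e-12:
        z = z_new
        break
    z = z_new

# Optimal controlled dynamics: reweight the passive dynamics by desirability.
U = P * z[None, :]
U /= U.sum(axis=1, keepdims=True)
```

Because the Bellman equation becomes linear in z, computing the optimal policy reduces to a (sparse) linear solve or power iteration, which is what makes the LMDP computationally attractive for the inverse problem: inferring the costs q that best explain observed trajectories.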