Paper Title

Flow-based Recurrent Belief State Learning for POMDPs

Paper Authors

Xiaoyu Chen, Yao Mu, Ping Luo, Shengbo Li, Jianyu Chen

Paper Abstract

Partially Observable Markov Decision Process (POMDP) provides a principled and generic framework to model real-world sequential decision making processes, yet remains unsolved, especially for high dimensional continuous spaces and unknown models. The main challenge lies in how to accurately obtain the belief state, the probability distribution over the unobservable environment states given historical information. Accurately calculating this belief state is a precondition for obtaining an optimal policy of POMDPs. Recent advances in deep learning techniques show great potential to learn good belief states. However, existing methods can only learn approximate distributions with limited flexibility. In this paper, we introduce the FlOw-based Recurrent BElief State model (FORBES), which incorporates normalizing flows into variational inference to learn general continuous belief states for POMDPs. Furthermore, we show that the learned belief states can be plugged into downstream RL algorithms to improve performance. In experiments, we show that our method successfully captures complex belief states that enable multi-modal predictions as well as high-quality reconstructions, and results on challenging visual-motor control tasks show that our method achieves superior performance and sample efficiency.
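To make the core idea concrete, below is a minimal, hypothetical sketch in PyTorch of what a flow-based recurrent belief model can look like: a GRU summarizes the action-observation history, a conditional Gaussian serves as the base distribution, and a stack of planar flows warps it into a more flexible belief over latent states. This illustrates the general technique only and is not the authors' FORBES implementation; all class and variable names (`BeliefFlow`, `PlanarFlow`, dimension arguments) are made up for the example.

```python
# Minimal illustrative sketch (NOT the authors' FORBES implementation).
# Assumes PyTorch; all names here are hypothetical.
import math
import torch
import torch.nn as nn

class PlanarFlow(nn.Module):
    """One planar transform z' = z + u * tanh(w^T z + b).
    Invertible when w^T u >= -1; that constraint is omitted here for brevity."""
    def __init__(self, dim):
        super().__init__()
        self.u = nn.Parameter(torch.randn(dim) * 0.01)
        self.w = nn.Parameter(torch.randn(dim) * 0.01)
        self.b = nn.Parameter(torch.zeros(1))

    def forward(self, z):
        lin = z @ self.w + self.b                        # (batch,)
        z_new = z + self.u * torch.tanh(lin).unsqueeze(-1)
        psi = (1.0 - torch.tanh(lin) ** 2).unsqueeze(-1) * self.w
        # log |det Jacobian| of the transform, needed for the density correction
        log_det = torch.log(torch.abs(1.0 + psi @ self.u) + 1e-8)
        return z_new, log_det

class BeliefFlow(nn.Module):
    """GRU history encoder + conditional Gaussian base + stack of flows."""
    def __init__(self, obs_dim, act_dim, hidden_dim, latent_dim, n_flows=4):
        super().__init__()
        self.rnn = nn.GRU(obs_dim + act_dim, hidden_dim, batch_first=True)
        self.base = nn.Linear(hidden_dim, 2 * latent_dim)   # mean and log-std
        self.flows = nn.ModuleList([PlanarFlow(latent_dim) for _ in range(n_flows)])

    def forward(self, obs_seq, act_seq):
        # obs_seq: (batch, T, obs_dim), act_seq: (batch, T, act_dim)
        h, _ = self.rnn(torch.cat([obs_seq, act_seq], dim=-1))
        h_t = h[:, -1]                                       # summary of the history up to t
        mean, log_std = self.base(h_t).chunk(2, dim=-1)
        eps = torch.randn_like(mean)
        z = mean + log_std.exp() * eps                       # reparameterized base sample
        log_q = (-0.5 * eps ** 2 - log_std - 0.5 * math.log(2 * math.pi)).sum(-1)
        for flow in self.flows:                              # warp the Gaussian into a
            z, log_det = flow(z)                             # more flexible belief
            log_q = log_q - log_det
        return z, log_q                                      # belief sample and its log-density

# Example usage with dummy history data:
model = BeliefFlow(obs_dim=16, act_dim=4, hidden_dim=64, latent_dim=8)
z, log_q = model(torch.randn(32, 10, 16), torch.randn(32, 10, 4))
```

In a variational setup, the returned sample `z` and its log-density `log_q` are the quantities the training objective needs: `log_q` enters the KL/entropy term of the ELBO, while `z` is passed to the decoder for reconstruction and can serve as the belief-state input to a downstream RL policy.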
