Paper Title

Model-based Reinforcement Learning for Semi-Markov Decision Processes with Neural ODEs

Paper Authors

Jianzhun Du, Joseph Futoma, Finale Doshi-Velez

Paper Abstract

We present two elegant solutions for modeling continuous-time dynamics, in a novel model-based reinforcement learning (RL) framework for semi-Markov decision processes (SMDPs), using neural ordinary differential equations (ODEs). Our models accurately characterize continuous-time dynamics and enable us to develop high-performing policies using a small amount of data. We also develop a model-based approach for optimizing time schedules to reduce interaction rates with the environment while maintaining near-optimal performance, which is not possible for model-free methods. We experimentally demonstrate the efficacy of our methods across various continuous-time domains.
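To make the core idea concrete, below is a minimal, self-contained sketch (not the authors' implementation) of a neural-ODE dynamics model for an SMDP: the latent state evolves according to a learned vector field, and integrating it over a variable decision interval dt yields the next-state prediction used for model-based planning. The class name `LatentODEDynamics`, the fixed-step Euler integrator, and all hyperparameters are illustrative assumptions; the paper's models may differ in architecture and solver.

```python
# Minimal sketch, assuming a PyTorch setup; names and hyperparameters are illustrative.
import torch
import torch.nn as nn


class LatentODEDynamics(nn.Module):
    """Parameterizes dz/dt = f_theta(z, a). Integrating over a variable
    duration dt gives the next state for a semi-Markov (irregular-interval)
    transition, rather than assuming a fixed discrete time step."""

    def __init__(self, state_dim, action_dim, hidden_dim=64):
        super().__init__()
        self.f = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, state_dim),
        )

    def forward(self, z, a, dt, n_steps=10):
        # Fixed-step Euler integration over [0, dt] for simplicity;
        # an adaptive ODE solver would typically be used in practice.
        h = dt / n_steps
        for _ in range(n_steps):
            z = z + h * self.f(torch.cat([z, a], dim=-1))
        return z


# Usage: predict the state after holding an action for a continuous duration dt.
model = LatentODEDynamics(state_dim=4, action_dim=1)
z0 = torch.randn(32, 4)          # batch of current states
a = torch.randn(32, 1)           # actions
dt = torch.full((32, 1), 0.37)   # variable decision intervals (SMDP setting)
z_next = model(z0, a, dt)
print(z_next.shape)              # torch.Size([32, 4])
```

Because the learned dynamics are continuous in time, such a model can be rolled out at arbitrary intervals, which is what enables optimizing the time schedule of interactions described in the abstract.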
