Paper Title

Entropic Risk Constrained Soft-Robust Policy Optimization

Authors

Reazul Hasan Russel, Bahram Behzadian, Marek Petrik

Abstract

Having a perfect model to compute the optimal policy is often infeasible in reinforcement learning. It is important in high-stakes domains to quantify and manage the risk induced by model uncertainties. The entropic risk measure is an exponential utility-based convex risk measure that satisfies many reasonable properties. In this paper, we propose entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to model uncertainty. We demonstrate the usefulness of our algorithms on several problem domains.
