Entropic Risk Constrained Soft-Robust Policy Optimization

Paper Title

Entropic Risk Constrained Soft-Robust Policy Optimization

Authors

Reazul Hasan Russel, Bahram Behzadian, Marek Petrik

Abstract

Having a perfect model to compute the optimal policy is often infeasible in reinforcement learning. In high-stakes domains, it is important to quantify and manage the risk induced by model uncertainty. The entropic risk measure is an exponential utility-based convex risk measure that satisfies many reasonable properties. In this paper, we propose entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to model uncertainty. We demonstrate the usefulness of our algorithms on several problem domains.
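
For readers unfamiliar with the entropic risk measure named in the abstract, the following is a minimal sketch of how it can be estimated from sampled returns. It assumes one common reward-based convention, rho_alpha(X) = -(1/alpha) * log E[exp(-alpha * X)] with alpha > 0; the sign convention, the parameter name `alpha`, and the function name `entropic_risk` are assumptions for illustration and are not taken from the paper, which may define the measure differently.

```python
import numpy as np


def entropic_risk(returns, alpha):
    """Estimate the entropic risk of sampled returns.

    Assumed convention (not taken from the paper):
        rho_alpha(X) = -(1 / alpha) * log E[exp(-alpha * X)],  alpha > 0
    Larger alpha means stronger risk aversion.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    x = np.asarray(returns, dtype=float)
    z = -alpha * x
    m = z.max()
    # log of the sample mean of exp(z), shifted by the max for stability
    log_mean_exp = m + np.log(np.mean(np.exp(z - m)))
    return -log_mean_exp / alpha


# Hypothetical returns of one policy under several sampled models
sampled_returns = [0.0, 10.0]
risk = entropic_risk(sampled_returns, alpha=1.0)
```

Under this convention the estimate never exceeds the sample mean and never falls below the worst sampled return, so it sits between the risk-neutral and worst-case evaluations of the policy.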
