Paper Title
Entropic Risk Constrained Soft-Robust Policy Optimization
Paper Authors
Abstract
In reinforcement learning, having a perfect model from which to compute the optimal policy is often infeasible. In high-stakes domains, it is important to quantify and manage the risk induced by model uncertainty. The entropic risk measure is an exponential utility-based convex risk measure that satisfies many reasonable properties. In this paper, we propose entropic risk constrained policy gradient and actor-critic algorithms that are risk-averse to model uncertainty. We demonstrate the usefulness of our algorithms on several problem domains.
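To make the entropic risk measure concrete, the sketch below estimates it from samples using the standard textbook form rho_alpha(X) = (1/alpha) * log E[exp(-alpha * X)] with risk-aversion parameter alpha > 0. This is an illustration only: the abstract does not state the paper's exact formulation or sign convention, and the function name `entropic_risk` and the reading of `returns` as returns sampled under different plausible models are assumptions made for this example.

```python
import math


def entropic_risk(returns, alpha):
    """Sample estimate of (1/alpha) * log E[exp(-alpha * X)].

    `returns` is a hypothetical list of sampled returns X (for example,
    one per plausible model); `alpha` > 0 sets the degree of risk aversion.
    Lower (more negative) values indicate a less risky return distribution.
    """
    if alpha <= 0:
        raise ValueError("alpha must be positive")
    if not returns:
        raise ValueError("returns must be non-empty")

    # Exponents of the exponential utility, one per sample.
    z = [-alpha * x for x in returns]

    # Log-sum-exp trick: factor out the largest exponent to avoid overflow.
    m = max(z)
    mean_exp = sum(math.exp(v - m) for v in z) / len(z)
    return (m + math.log(mean_exp)) / alpha


if __name__ == "__main__":
    samples = [0.0, 2.0]
    for a in (1e-6, 1.0, 50.0):
        print(a, entropic_risk(samples, a))
```

Under this convention, a small alpha gives a value close to the negative mean return (nearly risk-neutral), while a large alpha approaches the negative of the worst sampled return, so alpha controls how strongly unfavorable models are weighted.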