Paper Title
DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning
Paper Authors
Paper Abstract
We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional modeling of accumulated returns with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, surpassing baseline performance on various continuous control tasks. Unlike standard approaches that solely maximize expected rewards, we propose a unified framework for risk-sensitive learning that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in improving agent performance on both risk-neutral and risk-sensitive control tasks.
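As a concrete illustration (not taken from the paper), the sketch below shows how the two ideas the abstract combines fit together: a distributional critic represented by return quantiles supports a risk-sensitive actor objective, here computed with CVaR as one common example of a risk measure, traded off against a SAC-style entropy bonus. The function names `cvar` and `actor_objective`, the parameter values, and the use of CVaR specifically are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: quantile-based return distribution + risk-sensitive
# SAC-style objective. Assumes CVaR as the risk measure for illustration.
import numpy as np

def cvar(quantiles: np.ndarray, alpha: float) -> float:
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of returns.

    `quantiles` are N return quantiles predicted by the distributional
    critic; alpha = 1.0 recovers the risk-neutral mean."""
    n = max(1, int(np.ceil(alpha * len(quantiles))))
    return float(np.sort(quantiles)[:n].mean())

def actor_objective(quantiles: np.ndarray, log_prob: float,
                    alpha_risk: float = 0.25, temperature: float = 0.2) -> float:
    """Risk-sensitive objective: a risk measure of the return distribution
    plus an entropy term (-temperature * log pi(a|s)), as in SAC."""
    return cvar(quantiles, alpha_risk) - temperature * log_prob

# Toy usage: a critic that predicts 32 return quantiles for one (s, a) pair.
rng = np.random.default_rng(0)
q = rng.normal(loc=10.0, scale=3.0, size=32)  # stand-in quantile predictions
print(actor_objective(q, log_prob=-1.5))       # higher is better for the actor
```

Setting `alpha_risk = 1.0` collapses the objective to the usual risk-neutral expected return plus entropy, which is how a single framework can cover both the risk-neutral and risk-sensitive tasks the abstract mentions.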