Paper Title

DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning

Paper Authors

Xiaoteng Ma, Junyao Chen, Li Xia, Jun Yang, Qianchuan Zhao, Zhengyuan Zhou

Paper Abstract

We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional information of accumulated rewards and entropy-driven exploration from the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, surpassing baseline performance on various continuous control tasks. Unlike standard approaches that solely maximize expected rewards, we propose a unified framework for risk-sensitive learning, one that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in enhancing agent performance on both risk-neutral and risk-sensitive control tasks.
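To make the abstract's two ingredients concrete, here is a minimal sketch of how a return distribution (represented, as in distributional RL, by a set of equally weighted quantile estimates) can be collapsed into either a risk-neutral or a risk-averse (CVaR-style) value, and combined with a SAC-style entropy bonus. All function names and hyperparameter values below are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def risk_sensitive_value(quantiles, alpha=1.0):
    """Scalar value of a return distribution given by N equally weighted
    quantile estimates. alpha=1.0 recovers the risk-neutral expectation;
    alpha<1 gives CVaR_alpha, the mean of the worst alpha-fraction of
    outcomes (a common risk-averse objective in distributional RL)."""
    q = np.sort(np.asarray(quantiles, dtype=float))
    k = max(1, int(np.ceil(alpha * len(q))))  # number of worst-case quantiles
    return q[:k].mean()

def soft_objective(quantiles, log_pi, temperature=0.2, alpha=1.0):
    """SAC-style entropy-regularized objective: risk-adjusted value plus an
    entropy bonus (-temperature * log_pi). `temperature` is an illustrative
    hyperparameter, not a value from the paper."""
    return risk_sensitive_value(quantiles, alpha) - temperature * log_pi

# Example: four quantile estimates of the return distribution.
q = [0.0, 1.0, 2.0, 3.0]
print(risk_sensitive_value(q, alpha=1.0))   # risk-neutral mean: 1.5
print(risk_sensitive_value(q, alpha=0.5))   # CVaR over worst half: 0.5
```

The single `alpha` knob is what makes the framework "unified": the same machinery covers both the risk-neutral and the risk-sensitive control tasks mentioned in the abstract.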
