Paper Title
DSAC: Distributional Soft Actor-Critic for Risk-Sensitive Reinforcement Learning
Paper Authors
Paper Abstract
We present Distributional Soft Actor-Critic (DSAC), a distributional reinforcement learning (RL) algorithm that combines the strengths of distributional modeling of accumulated returns with the entropy-driven exploration of the Soft Actor-Critic (SAC) algorithm. DSAC models the randomness in both actions and rewards, surpassing baseline performance on various continuous control tasks. Unlike standard approaches that solely maximize expected rewards, we propose a unified framework for risk-sensitive learning that optimizes a risk-related objective while balancing entropy to encourage exploration. Extensive experiments demonstrate DSAC's effectiveness in improving agent performance on both risk-neutral and risk-sensitive control tasks.
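As a concrete illustration (not taken from the paper), the sketch below shows how the two ideas the abstract combines fit together: a distributional critic represented by return quantiles supports a risk-sensitive actor objective, here computed with CVaR as one common example of a risk measure, traded off against a SAC-style entropy bonus. The function names `cvar` and `actor_objective`, the parameter values, and the use of CVaR specifically are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch: quantile-based return distribution + risk-sensitive
# SAC-style objective. Assumes CVaR as the risk measure for illustration.
import numpy as np

def cvar(quantiles: np.ndarray, alpha: float) -> float:
    """Conditional Value-at-Risk: mean of the worst alpha-fraction of returns.

    `quantiles` are N return quantiles predicted by the distributional
    critic; alpha = 1.0 recovers the risk-neutral mean."""
    n = max(1, int(np.ceil(alpha * len(quantiles))))
    return float(np.sort(quantiles)[:n].mean())

def actor_objective(quantiles: np.ndarray, log_prob: float,
                    alpha_risk: float = 0.25, temperature: float = 0.2) -> float:
    """Risk-sensitive objective: a risk measure of the return distribution
    plus an entropy term (-temperature * log pi(a|s)), as in SAC."""
    return cvar(quantiles, alpha_risk) - temperature * log_prob

# Toy usage: a critic that predicts 32 return quantiles for one (s, a) pair.
rng = np.random.default_rng(0)
q = rng.normal(loc=10.0, scale=3.0, size=32)  # stand-in quantile predictions
print(actor_objective(q, log_prob=-1.5))       # higher is better for the actor
```

Setting `alpha_risk = 1.0` collapses the objective to the usual risk-neutral expected return plus entropy, which is how a single framework can cover both the risk-neutral and risk-sensitive tasks the abstract mentions.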