Paper Title
Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms
Paper Authors
Paper Abstract
In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of safety-critical scenarios during the training phase can result in poor generalization performance in real-world driving applications. We propose a novel framework in which the weaknesses of the training set are explored through black-box verification methods. After discovering AD failure scenarios, the RL agent's training is re-initiated via transfer learning to improve its performance in the previously unsafe scenarios. Simulation results demonstrate that our approach efficiently discovers safety failures in the action decisions of an RL-based adaptive cruise control (ACC) application and significantly reduces the number of vehicle collisions through iterative application of our method. The source code is publicly available at https://github.com/data-and-decision-lab/self-improving-RL.
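
To make the iterative train / verify / retrain loop described in the abstract concrete, below is a minimal, self-contained Python sketch. All names (Scenario, train_agent, black_box_search, is_collision, self_improving_training), the toy ACC scenario parameterization, and the random-sampling search are illustrative assumptions for this sketch, not the implementation or API of the linked repository.

"""Minimal sketch of the self-improvement loop from the abstract.

The agent, simulator, and black-box search below are toy placeholders
(hypothetical names, not the repository's API); they only illustrate
the train -> verify -> retrain iteration.
"""
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    """A parameterized ACC scenario (toy parameterization, assumed here)."""
    lead_speed: float      # lead-vehicle speed [m/s]
    initial_gap: float     # initial headway [m]
    deceleration: float    # lead-vehicle braking rate [m/s^2]


def train_agent(scenarios, agent=None):
    """Placeholder for RL training; `agent is not None` stands in for
    transfer learning (warm-starting from the current policy)."""
    return {"trained_on": len(scenarios), "warm_start": agent is not None}


def is_collision(agent, scenario):
    """Placeholder simulation rollout returning True if the agent crashes."""
    return random.random() < 0.05  # stand-in for a real ACC simulator


def black_box_search(agent, num_samples=1000):
    """Black-box verification via random sampling of the scenario space.
    (The paper's actual search strategy may differ; this is only a sketch.)"""
    failures = []
    for _ in range(num_samples):
        scenario = Scenario(
            lead_speed=random.uniform(0.0, 30.0),
            initial_gap=random.uniform(5.0, 100.0),
            deceleration=random.uniform(0.0, 9.0),
        )
        if is_collision(agent, scenario):
            failures.append(scenario)
    return failures


def self_improving_training(base_scenarios, num_iterations=5):
    """Iterate: train, search for failure scenarios, retrain on them."""
    agent = train_agent(base_scenarios)
    for i in range(num_iterations):
        failures = black_box_search(agent)
        print(f"iteration {i}: {len(failures)} failure scenarios found")
        if not failures:
            break  # no new safety failures discovered; stop iterating
        # Transfer learning: continue from the current policy on an
        # augmented scenario set that includes the discovered failures.
        base_scenarios = base_scenarios + failures
        agent = train_agent(base_scenarios, agent=agent)
    return agent


if __name__ == "__main__":
    initial = [Scenario(lead_speed=20.0, initial_gap=40.0, deceleration=3.0)]
    self_improving_training(initial)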