Paper Title
Self-Improving Safety Performance of Reinforcement Learning Based Driving with Black-Box Verification Algorithms
Paper Authors
Paper Abstract
In this work, we propose a self-improving artificial intelligence system to enhance the safety performance of reinforcement learning (RL)-based autonomous driving (AD) agents using black-box verification methods. RL algorithms have become popular in AD applications in recent years. However, the performance of existing RL algorithms heavily depends on the diversity of training scenarios. A lack of safety-critical scenarios during the training phase can result in poor generalization performance in real-world driving applications. We propose a novel framework in which the weaknesses of the training set are explored through black-box verification methods. After discovering AD failure scenarios, the RL agent's training is re-initiated via transfer learning to improve its performance in the previously unsafe scenarios. Simulation results demonstrate that our approach efficiently discovers safety failures in the action decisions of an RL-based adaptive cruise control (ACC) application and significantly reduces the number of vehicle collisions through iterative application of our method. The source code is publicly available at https://github.com/data-and-decision-lab/self-improving-RL.
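
To make the iterative train / verify / retrain loop described in the abstract concrete, below is a minimal, self-contained Python sketch. All names (Scenario, train_agent, black_box_search, is_collision, self_improving_training), the toy ACC scenario parameterization, and the random-sampling search are illustrative assumptions for this sketch, not the implementation or API of the linked repository.

"""Minimal sketch of the self-improvement loop from the abstract.

The agent, simulator, and black-box search below are toy placeholders
(hypothetical names, not the repository's API); they only illustrate
the train -> verify -> retrain iteration.
"""
import random
from dataclasses import dataclass


@dataclass
class Scenario:
    """A parameterized ACC scenario (toy parameterization, assumed here)."""
    lead_speed: float      # lead-vehicle speed [m/s]
    initial_gap: float     # initial headway [m]
    deceleration: float    # lead-vehicle braking rate [m/s^2]


def train_agent(scenarios, agent=None):
    """Placeholder for RL training; `agent is not None` stands in for
    transfer learning (warm-starting from the current policy)."""
    return {"trained_on": len(scenarios), "warm_start": agent is not None}


def is_collision(agent, scenario):
    """Placeholder simulation rollout returning True if the agent crashes."""
    return random.random() < 0.05  # stand-in for a real ACC simulator


def black_box_search(agent, num_samples=1000):
    """Black-box verification via random sampling of the scenario space.
    (The paper's actual search strategy may differ; this is only a sketch.)"""
    failures = []
    for _ in range(num_samples):
        scenario = Scenario(
            lead_speed=random.uniform(0.0, 30.0),
            initial_gap=random.uniform(5.0, 100.0),
            deceleration=random.uniform(0.0, 9.0),
        )
        if is_collision(agent, scenario):
            failures.append(scenario)
    return failures


def self_improving_training(base_scenarios, num_iterations=5):
    """Iterate: train, search for failure scenarios, retrain on them."""
    agent = train_agent(base_scenarios)
    for i in range(num_iterations):
        failures = black_box_search(agent)
        print(f"iteration {i}: {len(failures)} failure scenarios found")
        if not failures:
            break  # no new safety failures discovered; stop iterating
        # Transfer learning: continue from the current policy on an
        # augmented scenario set that includes the discovered failures.
        base_scenarios = base_scenarios + failures
        agent = train_agent(base_scenarios, agent=agent)
    return agent


if __name__ == "__main__":
    initial = [Scenario(lead_speed=20.0, initial_gap=40.0, deceleration=3.0)]
    self_improving_training(initial)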