COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking

Paper Title

COOL-MC: A Comprehensive Tool for Reinforcement Learning and Model Checking

Authors

Dennis Gross, Nils Jansen, Sebastian Junges, Guillermo A. Perez

Abstract

This paper presents COOL-MC, a tool that integrates state-of-the-art reinforcement learning (RL) and model checking. Specifically, the tool builds upon the OpenAI gym and the probabilistic model checker Storm. COOL-MC provides the following features: (1) a simulator to train RL policies in the OpenAI gym for Markov decision processes (MDPs) that are defined as input for Storm, (2) a new model builder for Storm, which uses callback functions to verify (neural network) RL policies, (3) formal abstractions that relate models and policies specified in OpenAI gym or Storm, and (4) algorithms to obtain bounds on the performance of so-called permissive policies. We describe the components and architecture of COOL-MC and demonstrate its features on multiple benchmark environments.
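Feature (2) of the abstract, verifying a policy through a callback, can be illustrated with a small sketch. This is not COOL-MC's or Storm's actual API: the toy MDP, the `policy` function, and the helpers `build_induced_chain` and `reach_probability` are all hypothetical names made up for illustration. The sketch assumes that verification amounts to asking the policy for an action in each reachable state, building the resulting Markov chain, and then computing a reachability probability on it.

```python
from collections import deque

# Hypothetical toy MDP: state -> action -> list of (probability, next_state).
MDP = {
    "s0": {"safe": [(1.0, "s1")], "risky": [(0.5, "goal"), (0.5, "trap")]},
    "s1": {"safe": [(0.9, "goal"), (0.1, "s0")], "risky": [(1.0, "trap")]},
    "goal": {},  # absorbing
    "trap": {},  # absorbing
}


def policy(state):
    """Stand-in for a trained RL policy (e.g. the argmax of a neural network)."""
    return "safe"


def build_induced_chain(mdp, policy_callback, initial):
    """Explore only states reachable under the policy.

    Returns a Markov chain as state -> list of (probability, next_state).
    """
    chain, queue = {}, deque([initial])
    while queue:
        state = queue.popleft()
        if state in chain:
            continue
        actions = mdp[state]
        if not actions:  # absorbing state, no action to choose
            chain[state] = []
            continue
        # The callback is queried once per reachable state.
        chain[state] = actions[policy_callback(state)]
        for _, nxt in chain[state]:
            if nxt not in chain:
                queue.append(nxt)
    return chain


def reach_probability(chain, initial, target, sweeps=1000):
    """Approximate Pr(eventually reach target) by value iteration from below."""
    value = {s: 1.0 if s == target else 0.0 for s in chain}
    for _ in range(sweeps):
        for s, successors in chain.items():
            if s != target and successors:
                value[s] = sum(p * value[n] for p, n in successors)
    return value[initial]


chain = build_induced_chain(MDP, policy, "s0")
prob = reach_probability(chain, "s0", "goal")
```

Because the chain is built by querying the policy state by state, the state `trap` is never explored under this always-`safe` policy, which is the practical appeal of a callback-based builder: only the part of the model the policy can actually reach is constructed. A real probabilistic model checker would replace the naive value iteration used here.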
