Behavior Cloning in OpenAI using Case Based Reasoning

Paper title

Behavior Cloning in OpenAI using Case Based Reasoning

Paper authors

Chad Peters, Babak Esfandiari, Mohamad Zalat, Robert West

Paper abstract

Learning from Observation (LfO), also known as Behavioral Cloning, is an approach for building software agents by recording the behavior of an expert (human or artificial) and using the recorded data to generate the required behavior. jLOAF is a platform that uses Case-Based Reasoning to achieve LfO. In this paper we interface jLOAF with the popular OpenAI Gym environment. Our experimental results show how our approach can be used to provide a baseline for comparison in this domain, as well as identify the strengths and weaknesses when dealing with environmental complexity.
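
The idea in the abstract (record an expert's observation/action pairs, then reuse the closest recorded case to choose an action) can be illustrated with a minimal sketch. This is not jLOAF's API or the authors' implementation; the `CaseBase` class, its method names, the Euclidean distance measure, and the k-nearest majority vote are all assumptions made for illustration only.

```python
import math
from collections import Counter


class CaseBase:
    """Hypothetical case base holding (observation, action) pairs from an expert."""

    def __init__(self):
        self.cases = []

    def record(self, observation, action):
        # Store one expert demonstration step as a case.
        self.cases.append((tuple(observation), action))

    def act(self, observation, k=1):
        # Retrieve the k cases closest to the current observation
        # (Euclidean distance, an assumed similarity measure) and
        # reuse the most common action among them.
        if not self.cases:
            raise ValueError("case base is empty")
        query = tuple(observation)
        nearest = sorted(self.cases, key=lambda c: math.dist(c[0], query))[:k]
        votes = Counter(action for _, action in nearest)
        return votes.most_common(1)[0][0]


# Usage: two recorded expert steps, then a query near the second one.
cb = CaseBase()
cb.record([0.0, 0.1], "left")
cb.record([1.0, 0.9], "right")
chosen = cb.act([0.9, 1.0])
```

In a Gym-style loop, the observation returned by the environment at each step would be passed to `act`, and the returned action sent back to the environment; that wiring is omitted here because the abstract does not describe it.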
