Paper title
Assessing Quality-Diversity Neuro-Evolution Algorithms Performance in Hard Exploration Problems
Paper authors
Paper abstract
A fascinating aspect of nature lies in its ability to produce a collection of organisms that are all high-performing in their niche. Quality-Diversity (QD) methods are evolutionary algorithms inspired by this observation that have obtained strong results in many applications, from wing design to robot adaptation. Recently, several works have demonstrated that these methods can be applied to neuro-evolution to solve control problems in large search spaces. In such problems, diversity can be a target in itself. Diversity can also be a way to enhance exploration in tasks exhibiting deceptive reward signals. While the first aspect has been studied in depth in the QD community, the latter remains less explored in the literature. Exploration is at the heart of several domains that try to solve control problems, such as Reinforcement Learning, and QD methods are promising candidates to overcome the associated challenges. Therefore, we believe that standardized benchmarks exhibiting high-dimensional control problems with exploration difficulties are of interest to the QD community. In this paper, we highlight three candidate benchmarks and explain why they appear relevant for the systematic evaluation of QD algorithms. We also provide open-source implementations in JAX, allowing practitioners to run fast and numerous experiments with few compute resources.