Paper Title
Temporal Logic Imitation: Learning Plan-Satisficing Motion Policies from Demonstrations
Paper Authors
Paper Abstract
Learning from demonstration (LfD) has succeeded in tasks featuring a long time horizon. However, when the problem complexity also includes human-in-the-loop perturbations, state-of-the-art approaches do not guarantee the successful reproduction of a task. In this work, we identify the roots of this challenge as the failure of a learned continuous policy to satisfy the discrete plan implicit in the demonstration. By utilizing modes (rather than subgoals) as the discrete abstraction and motion policies with both mode invariance and goal reachability properties, we prove our learned continuous policy can simulate any discrete plan specified by a linear temporal logic (LTL) formula. Consequently, an imitator is robust to both task- and motion-level perturbations and guaranteed to achieve task success. Project page: https://yanweiw.github.io/tli/
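The abstract's core idea is that a continuous policy satisfies a discrete plan when its trajectory stays inside the allowed modes (mode invariance) and visits the plan's modes in the specified order (goal reachability). A minimal sketch of that discrete-plan check, with hypothetical mode names and a simple in-order sequencing specification standing in for a full LTL formula:

```python
# Hypothetical sketch: check whether the mode sequence induced by a
# trajectory satisfies a simple sequencing spec ("visit the planned
# modes in order, never leave the allowed mode set"). This stands in
# for full LTL satisfaction; all names are illustrative, not from
# the paper's implementation.

def mode_sequence_satisfies(trace, plan, allowed):
    """trace: modes visited over time; plan: required mode order;
    allowed: set of permissible modes (mode invariance)."""
    # Mode invariance: the trajectory must never enter a forbidden mode.
    if any(m not in allowed for m in trace):
        return False
    # Goal reachability: the plan's modes must appear in order
    # (as a subsequence of the trace).
    i = 0
    for m in trace:
        if i < len(plan) and m == plan[i]:
            i += 1
    return i == len(plan)

# A trace that dwells in modes but visits reach -> grasp -> release
# in order satisfies the plan.
print(mode_sequence_satisfies(
    ["reach", "reach", "grasp", "transport", "release"],
    ["reach", "grasp", "release"],
    {"reach", "grasp", "transport", "release"}))  # True
```

Under this framing, a perturbation that knocks the system out of a mode is tolerable as long as the policy re-enters the allowed set and still completes the plan's ordering, which is the robustness property the abstract claims.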