Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency

Paper Title

Learning Cross-Domain Correspondence for Control with Dynamics Cycle-Consistency

Authors

Zhang, Qiang; Xiao, Tete; Efros, Alexei A.; Pinto, Lerrel; Wang, Xiaolong

Abstract

At the heart of many robotics problems is the challenge of learning correspondences across domains. For instance, imitation learning requires correspondence between humans and robots; sim-to-real requires correspondence between physics simulators and the real world; transfer learning requires correspondence between different robotics environments. This paper aims to learn correspondence across domains that differ in representation (vision vs. internal state), physics parameters (mass and friction), and morphology (number of limbs). Importantly, correspondences are learned from unpaired and randomly collected data from the two domains. We propose dynamics cycles, which align dynamic robot behavior across two domains using a cycle-consistency constraint. Once this correspondence is found, we can directly transfer a policy trained on one domain to the other, without any additional fine-tuning on the second domain. We perform experiments across a variety of problem domains, both in simulation and on a real robot. Our framework is able to align uncalibrated monocular video of a real robot arm to dynamic state-action trajectories of a simulated arm without paired data. Video demonstrations of our results are available at: https://sjtuzq.github.io/cycle_dynamics.html.
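
To make the idea of a cycle-consistency constraint on dynamics more concrete, here is a minimal sketch of one plausible reading of it: translate a source-domain state and action into the target domain, step the target dynamics, and compare the result with the translated next source state. This is an illustration under stated assumptions, not the authors' implementation. The function and argument names (`dynamics_cycle_loss`, `state_map`, `action_map`, `target_dynamics`), the squared-error penalty, and the toy linear domains are all hypothetical.

```python
import numpy as np

def dynamics_cycle_loss(x_t, a_t, x_next, state_map, action_map, target_dynamics):
    """Mean squared disagreement between two routes to the next target state.

    Route 1: translate (x_t, a_t) into the target domain, then step the
             target dynamics.
    Route 2: step in the source domain (x_next is observed data), then
             translate the resulting state.
    All callables are hypothetical stand-ins for learned models.
    """
    y_t = state_map(x_t)                  # translated current state
    b_t = action_map(x_t, a_t)            # translated action
    y_pred = target_dynamics(y_t, b_t)    # route 1
    y_next = state_map(x_next)            # route 2
    return float(np.mean((y_pred - y_next) ** 2))

# Toy domains (assumed for illustration): the source evolves as x' = x + a,
# and the target is the same system with states and actions scaled by 2.
rng = np.random.default_rng(0)
x_t = rng.normal(size=(8, 3))
a_t = rng.normal(size=(8, 3))
x_next = x_t + a_t

def target_dynamics(y, b):
    return y + b

def state_map(x):
    return 2.0 * x

# A correct action translation gives a loss of (numerically) zero.
aligned = dynamics_cycle_loss(
    x_t, a_t, x_next, state_map, lambda x, a: 2.0 * a, target_dynamics)

# A wrong action translation leaves a positive residual.
misaligned = dynamics_cycle_loss(
    x_t, a_t, x_next, state_map, lambda x, a: a, target_dynamics)
```

In this toy setting the loss needs only unpaired transitions from the source domain plus a model of the target dynamics, which matches the abstract's point that no paired data across domains is required. How the actual method trains the maps and the dynamics model is not specified on this page.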

Paper Title