Title

Learning Dense Visual Descriptors using Image Augmentations for Robot Manipulation Tasks

Authors

Christian Graf, David B. Adrian, Joshua Weil, Miroslav Gabriel, Philipp Schillinger, Markus Spies, Heiko Neumann, Andras Kupcsik

Abstract

We propose a self-supervised training approach for learning view-invariant dense visual descriptors using image augmentations. Unlike existing works, which often require complex datasets, such as registered RGBD sequences, we train on an unordered set of RGB images. This allows for learning from a single camera view, e.g., in an existing robotic cell with a fix-mounted camera. We create synthetic views and dense pixel correspondences using data augmentations. We find that our descriptors are competitive with existing methods, despite the simpler data recording and setup requirements. We show that training on synthetic correspondences provides descriptor consistency across a broad range of camera views. We compare against training with geometric correspondences from multiple views and provide ablation studies. We also show a robotic bin-picking experiment that uses descriptors learned from a fix-mounted camera to define grasp preferences.
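
To illustrate the core idea of the abstract, the sketch below shows one simple way to create a synthetic second view and dense pixel correspondences from a single RGB image via a random affine augmentation. This is a minimal illustration, not the authors' implementation: the helper names `random_affine` and `make_synthetic_pair` are hypothetical, and the paper's actual augmentation set and training loss are not reproduced here.

```python
# Illustrative sketch only (not the authors' code): warp a single RGB image
# with a known random affine transform to get a synthetic second view, then
# derive pixel correspondences directly from that transform. The resulting
# (xy_a, xy_b) pairs could supervise a pixelwise contrastive descriptor loss.
import numpy as np
import cv2


def random_affine(h, w, max_angle=30.0, max_scale=0.2):
    """Sample a random rotation/scale about the image center (2x3 matrix)."""
    angle = np.random.uniform(-max_angle, max_angle)
    scale = 1.0 + np.random.uniform(-max_scale, max_scale)
    return cv2.getRotationMatrix2D((w / 2.0, h / 2.0), angle, scale)


def make_synthetic_pair(img, n_pairs=512):
    """Return (warped image, pixels in view A, matching pixels in view B)."""
    h, w = img.shape[:2]
    M = random_affine(h, w)                # ground-truth warp is known
    img_b = cv2.warpAffine(img, M, (w, h))

    # Sample random pixel locations in the original view A ...
    xy_a = np.stack([np.random.randint(0, w, n_pairs),
                     np.random.randint(0, h, n_pairs)], axis=1).astype(np.float64)
    # ... and map them through the affine warp to view-B locations.
    xy_b = np.hstack([xy_a, np.ones((n_pairs, 1))]) @ M.T

    # Keep only correspondences that stay inside the warped image.
    valid = ((xy_b[:, 0] >= 0) & (xy_b[:, 0] < w) &
             (xy_b[:, 1] >= 0) & (xy_b[:, 1] < h))
    return img_b, xy_a[valid], xy_b[valid]
```

Note that photometric augmentations (e.g., color jitter or blur) can additionally be applied to each view independently without invalidating the correspondences, since they do not move pixels.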
