Paper Title
Self-Supervised Object-in-Gripper Segmentation from Robotic Motions
Paper Authors
Paper Abstract
Accurate object segmentation is a crucial task in the context of robotic manipulation. However, creating sufficient annotated training data for neural networks is particularly time-consuming and often requires manual labeling. To this end, we propose a simple yet robust solution for learning to segment unknown objects grasped by a robot. Specifically, we exploit motion and temporal cues in RGB video sequences. Using optical flow estimation, we first learn to predict segmentation masks of our given manipulator. These annotations are then used in combination with motion cues to automatically distinguish between the background, the manipulator, and the unknown grasped object. In contrast to existing systems, our approach is fully self-supervised and independent of precise camera calibration, 3D models, or potentially imperfect depth data. We perform a thorough comparison with alternative baselines and approaches from the literature. The resulting object masks and views are shown to be suitable training data for segmentation networks that generalize to novel environments, and they also allow for watertight 3D reconstruction.
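As a rough illustration of the motion cue the abstract builds on (not the authors' actual pipeline), the sketch below thresholds dense optical flow between two consecutive RGB frames to obtain a coarse mask of moving pixels, i.e., the manipulator plus whatever it is holding, against a static background. OpenCV's Farneback flow, the frame filenames, and the magnitude threshold are all illustrative assumptions.

```python
# Minimal sketch (assumed setup, not the paper's method): a coarse
# motion mask from dense optical flow between two consecutive frames.
import cv2
import numpy as np

def motion_mask(frame_prev, frame_next, mag_thresh=1.0):
    """Return a binary mask of pixels that moved between two frames."""
    gray_prev = cv2.cvtColor(frame_prev, cv2.COLOR_BGR2GRAY)
    gray_next = cv2.cvtColor(frame_next, cv2.COLOR_BGR2GRAY)
    # Dense optical flow: per-pixel (dx, dy) displacement.
    flow = cv2.calcOpticalFlowFarneback(
        gray_prev, gray_next, None,
        pyr_scale=0.5, levels=3, winsize=15,
        iterations=3, poly_n=5, poly_sigma=1.2, flags=0)
    magnitude = np.linalg.norm(flow, axis=2)
    # Moving pixels (manipulator + grasped object) vs. static background.
    mask = (magnitude > mag_thresh).astype(np.uint8) * 255
    # Morphological opening to suppress speckle noise.
    kernel = np.ones((5, 5), np.uint8)
    return cv2.morphologyEx(mask, cv2.MORPH_OPEN, kernel)

if __name__ == "__main__":
    prev = cv2.imread("frame_000.png")  # hypothetical frame paths
    nxt = cv2.imread("frame_001.png")
    cv2.imwrite("motion_mask.png", motion_mask(prev, nxt))
```

Such a flow-based mask alone cannot tell the manipulator apart from the grasped object; per the abstract, the paper resolves this by first learning a segmentation mask of the known manipulator and then subtracting it from the moving region, leaving the unknown object.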