Paper Title
Learning a Geometric Representation for Data-Efficient Depth Estimation via Gradient Field and Contrastive Loss
Paper Authors
Paper Abstract
Estimating a depth map from a single RGB image has been widely investigated for localization, mapping, and 3-dimensional object detection. Recent studies on single-view depth estimation are mostly based on deep convolutional neural networks (ConvNets), which require a large amount of training data paired with densely annotated labels. Depth annotation is both expensive and inefficient, so it is natural to leverage RGB images, which can be collected very easily, to boost the performance of ConvNets without depth labels. However, most self-supervised learning algorithms focus on capturing the semantic information of images to improve performance in classification or object detection, not in depth estimation. In this paper, we show that existing self-supervised methods do not perform well on depth estimation and propose a gradient-based self-supervised learning algorithm with a momentum contrastive loss that helps ConvNets extract geometric information from unlabeled images. As a result, the network can estimate depth maps accurately with a relatively small amount of annotated data. To show that our method is independent of the model structure, we evaluate it with two different monocular depth estimation algorithms. Our method outperforms previous state-of-the-art self-supervised learning algorithms and triples the efficiency of labeled data compared with random initialization on the NYU Depth v2 dataset.
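The abstract names two ingredients: a gradient field of the image as a geometric pretext signal, and a MoCo-style momentum contrastive loss. The paper's exact formulation is not given here, so the following is only a minimal numpy sketch under common assumptions: the gradient field is taken to be Sobel gradients, the contrastive loss is standard InfoNCE with the positive key produced by a momentum (EMA) copy of the encoder, and all function names are illustrative, not the authors' API.

```python
import numpy as np

def gradient_field(img):
    """Sobel gradients (gx, gy) of a grayscale image -- an assumed stand-in
    for the paper's geometric 'gradient field' pretext target."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    pad = np.pad(img.astype(float), 1, mode="edge")
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 3, j:j + 3]
            gx[i, j] = (patch * kx).sum()
            gy[i, j] = (patch * ky).sum()
    return gx, gy

def info_nce(q, k_pos, k_negs, tau=0.07):
    """MoCo-style InfoNCE loss for one query embedding q:
    cross-entropy over similarities, positive key at index 0."""
    q = q / np.linalg.norm(q)
    logits = [q @ (k_pos / np.linalg.norm(k_pos)) / tau]
    for k in k_negs:  # negatives, e.g. drawn from a MoCo queue
        logits.append(q @ (k / np.linalg.norm(k)) / tau)
    logits = np.array(logits)
    return -logits[0] + np.log(np.exp(logits).sum())

def momentum_update(theta_k, theta_q, m=0.999):
    """EMA update of the key encoder's parameters (the 'momentum' in
    momentum contrast): theta_k <- m * theta_k + (1 - m) * theta_q."""
    return m * theta_k + (1 - m) * theta_q
```

In a MoCo-style loop, the query encoder would embed one augmented view (here, conceivably the gradient field of the image), the momentum encoder would embed the other, and `info_nce` would pull matched pairs together against queued negatives, with `momentum_update` applied to the key encoder after each step.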