Paper Title

Learning to Recover 3D Scene Shape from a Single Image

Paper Authors

Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Long Mai, Simon Chen, Chunhua Shen

Abstract

Despite significant progress in monocular depth estimation in the wild, recent state-of-the-art methods cannot be used to recover accurate 3D scene shape, due to an unknown depth shift induced by the shift-invariant reconstruction losses used in mixed-data depth prediction training, as well as a possibly unknown camera focal length. We investigate this problem in detail and propose a two-stage framework that first predicts depth up to an unknown scale and shift from a single monocular image, and then uses 3D point cloud encoders to predict the missing depth shift and focal length, allowing us to recover a realistic 3D scene shape. In addition, we propose an image-level normalized regression loss and a normal-based geometry loss to enhance depth prediction models trained on mixed datasets. We test our depth model on nine unseen datasets and achieve state-of-the-art performance on zero-shot dataset generalization. Code is available at: https://git.io/Depth
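
To make the recovery step concrete, the sketch below shows how a 3D point cloud can be reconstructed once the second stage has supplied the missing depth shift and focal length. This is a minimal NumPy illustration assuming a pinhole camera with the principal point at the image center; the function name and the sample shift and focal-length values are hypothetical, not the authors' implementation.

import numpy as np

def unproject_depth(depth_pred, shift, focal_length):
    # Recover a 3D point cloud (up to an overall scale) from an
    # affine-invariant depth map, given a predicted depth shift and
    # focal length (hypothetical interface).
    h, w = depth_pred.shape
    # Undo the unknown shift left over from shift-invariant training.
    depth = depth_pred + shift
    # Pinhole back-projection, principal point assumed at the image center.
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - w / 2.0) * depth / focal_length
    y = (v - h / 2.0) * depth / focal_length
    return np.stack([x, y, depth], axis=-1)  # (H, W, 3) point cloud

# Hypothetical usage: depth from the first-stage network, shift and
# focal length from the second-stage point cloud module.
depth_pred = np.random.rand(480, 640).astype(np.float32)  # stand-in depth map
points = unproject_depth(depth_pred, shift=0.3, focal_length=500.0)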
