Paper title
Towards the Probabilistic Fusion of Learned Priors into Standard Pipelines for 3D Reconstruction
Authors
Abstract
The best way to combine the results of deep learning with standard 3D reconstruction pipelines remains an open problem. While systems that pass the output of traditional multi-view stereo approaches to a network for regularisation or refinement currently seem to get the best results, it may be preferable to treat deep neural networks as separate components whose results can be probabilistically fused into geometry-based systems. Unfortunately, the error models required to do this type of fusion are not well understood, with many different approaches being put forward. Recently, a few systems have achieved good results by having their networks predict probability distributions rather than single values. We propose using this approach to fuse a learned single-view depth prior into a standard 3D reconstruction system. Our system is capable of incrementally producing dense depth maps for a set of keyframes. We train a deep neural network to predict discrete, nonparametric probability distributions for the depth of each pixel from a single image. We then fuse this "probability volume" with another probability volume based on the photometric consistency between subsequent frames and the keyframe image. We argue that combining the probability volumes from these two sources will result in a volume that is better conditioned. To extract depth maps from the volume, we minimise a cost function that includes a regularisation term based on network predicted surface normals and occlusion boundaries. Through a series of experiments, we demonstrate that each of these components improves the overall performance of the system.
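To make the fusion step concrete, the following is a minimal sketch for a single pixel. The abstract does not state the fusion rule, so this assumes the two sources are conditionally independent given the depth and fuses them as a normalised element-wise product. The function names, depth bins, and probability values are all hypothetical. The per-pixel argmax at the end is a simplification: the system described above instead minimises a cost function with a regularisation term.

```python
import math


def fuse_distributions(prior, likelihood, eps=1e-12):
    """Fuse two discrete distributions defined over the same depth bins.

    Assumption (not from the paper): the sources are conditionally
    independent given the depth, so the fused distribution is the
    normalised element-wise product.
    """
    if len(prior) != len(likelihood):
        raise ValueError("both distributions must use the same depth bins")
    # Sum log-probabilities so products of small values do not underflow.
    log_post = [math.log(p + eps) + math.log(q + eps)
                for p, q in zip(prior, likelihood)]
    peak = max(log_post)
    unnormalised = [math.exp(v - peak) for v in log_post]
    total = sum(unnormalised)
    return [u / total for u in unnormalised]


def most_likely_depth(dist, depth_bins):
    """Return the depth of the most probable bin (no regularisation)."""
    best = max(range(len(dist)), key=lambda i: dist[i])
    return depth_bins[best]


# Hypothetical values for one pixel, four depth bins in metres.
depth_bins = [1.0, 2.0, 3.0, 4.0]
network_prior = [0.10, 0.40, 0.40, 0.10]  # learned prior: 2 m or 3 m
photometric = [0.05, 0.15, 0.60, 0.20]    # multi-view evidence favours 3 m

fused = fuse_distributions(network_prior, photometric)
depth = most_likely_depth(fused, depth_bins)
```

In this toy case the learned prior alone cannot decide between two bins, while the fused distribution is more sharply peaked than either input, which illustrates the sense in which the combined volume may be better conditioned.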