HoW-3D: Holistic 3D Wireframe Perception from a Single Image

Paper Title

HoW-3D: Holistic 3D Wireframe Perception from a Single Image

Authors

Wenchao Ma, Bin Tan, Nan Xue, Tianfu Wu, Xianwei Zheng, Gui-Song Xia

Abstract

This paper studies the problem of holistic 3D wireframe perception (HoW-3D), a new task of perceiving both the visible 3D wireframes and the invisible ones from single-view 2D images. Because the non-front surfaces of an object cannot be directly observed in a single view, estimating the non-line-of-sight (NLOS) geometries in HoW-3D is a fundamentally challenging problem and remains open in computer vision. We study HoW-3D by proposing the ABC-HoW benchmark, which is built on CAD models sourced from the ABC dataset and contains 12k single-view images with the corresponding holistic 3D wireframe models. With this large-scale benchmark available, we present a novel Deep Spatial Gestalt (DSG) model that learns the visible junctions and line segments as the basis and then infers the NLOS 3D structures from the visible cues by following the Gestalt principles of the human visual system. In our experiments, we demonstrate that the DSG model performs very well in inferring holistic 3D wireframes from single-view images. Compared with strong baseline methods, the DSG model outperforms previous wireframe detectors in detecting invisible line geometry in single-view images, and is even very competitive with prior art that takes high-fidelity point clouds as input for reconstructing 3D wireframes.
