Paper Title
Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion
Paper Authors
Paper Abstract
While many works focus on 3D reconstruction from images, in this paper, we focus on 3D shape reconstruction and completion from a variety of 3D inputs, which are deficient in some respect: low and high resolution voxels, sparse and dense point clouds, complete or incomplete. Processing of such 3D inputs is an increasingly important problem as they are the output of 3D scanners, which are becoming more accessible, and are the intermediate output of 3D computer vision algorithms. Recently, learned implicit functions have shown great promise as they produce continuous reconstructions. However, we identified two limitations in reconstruction from 3D inputs: 1) details present in the input data are not retained, and 2) poor reconstruction of articulated humans. To solve this, we propose Implicit Feature Networks (IF-Nets), which deliver continuous outputs, can handle multiple topologies, and complete shapes for missing or sparse input data retaining the nice properties of recent learned implicit functions, but critically they can also retain detail when it is present in the input data, and can reconstruct articulated humans. Our work differs from prior work in two crucial aspects. First, instead of using a single vector to encode a 3D shape, we extract a learnable 3-dimensional multi-scale tensor of deep features, which is aligned with the original Euclidean space embedding the shape. Second, instead of classifying x-y-z point coordinates directly, we classify deep features extracted from the tensor at a continuous query point. We show that this forces our model to make decisions based on global and local shape structure, as opposed to point coordinates, which are arbitrary under Euclidean transformations. Experiments demonstrate that IF-Nets clearly outperform prior work in 3D object reconstruction in ShapeNet, and obtain significantly more accurate 3D human reconstructions.
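To make the two architectural differences described in the abstract concrete, the following is a minimal, hypothetical PyTorch sketch of the query mechanism: a 3D encoder produces a deep feature grid aligned with the input's Euclidean space, features are trilinearly interpolated at continuous query points, and a small point-wise decoder classifies occupancy from those features rather than from raw x-y-z coordinates. The class name ImplicitFeatureSketch, the single-scale grid, and all layer sizes are illustrative assumptions, not the authors' IF-Net architecture, which extracts a multi-scale feature tensor and concatenates features from several resolutions before classification.

```python
# Sketch of querying a spatial feature grid at continuous points (single scale
# for brevity); sizes and names are illustrative assumptions, not IF-Nets itself.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ImplicitFeatureSketch(nn.Module):
    def __init__(self, feat_channels: int = 32):
        super().__init__()
        # 3D encoder: voxelized input -> feature grid aligned with Euclidean space.
        self.encoder = nn.Sequential(
            nn.Conv3d(1, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv3d(feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        # Point-wise decoder: classifies sampled features as inside/outside.
        self.decoder = nn.Sequential(
            nn.Linear(feat_channels, 128),
            nn.ReLU(),
            nn.Linear(128, 1),
        )

    def forward(self, voxels: torch.Tensor, points: torch.Tensor) -> torch.Tensor:
        # voxels: (B, 1, D, H, W) occupancy grid; points: (B, N, 3) in [-1, 1]^3.
        feat_grid = self.encoder(voxels)                              # (B, C, D, H, W)
        # Trilinearly interpolate deep features at the continuous query locations.
        grid = points.view(points.shape[0], 1, 1, -1, 3)              # (B, 1, 1, N, 3)
        sampled = F.grid_sample(feat_grid, grid, align_corners=True)  # (B, C, 1, 1, N)
        sampled = sampled.view(points.shape[0], feat_grid.shape[1], -1).transpose(1, 2)
        return self.decoder(sampled).squeeze(-1)                      # (B, N) occupancy logits


if __name__ == "__main__":
    model = ImplicitFeatureSketch()
    vox = torch.rand(2, 1, 32, 32, 32)    # toy voxelized inputs
    pts = torch.rand(2, 1024, 3) * 2 - 1  # random query points in [-1, 1]^3
    print(model(vox, pts).shape)          # torch.Size([2, 1024])
```

Because the classifier only ever sees interpolated features (not the coordinates themselves), its decisions depend on local and global shape structure encoded in the grid, which is the property the abstract emphasizes.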