Paper Title
PuzzleFusion: Unleashing the Power of Diffusion Models for Spatial Puzzle Solving
Paper Authors
Paper Abstract
This paper presents an end-to-end neural architecture based on Diffusion Models for spatial puzzle solving, particularly jigsaw puzzle and room arrangement tasks. In the latter task, for instance, the proposed system "PuzzleFusion" takes a set of room layouts as polygonal curves in the top-down view and aligns the room layout pieces by estimating their 2D translations and rotations, akin to solving a jigsaw puzzle of room layouts. A surprising discovery of the paper is that the simple use of a Diffusion Model effectively solves these challenging spatial puzzle tasks as a conditional generation process. To enable learning of an end-to-end neural system, the paper introduces new datasets with ground-truth arrangements: 1) the 2D Voronoi jigsaw dataset, a synthetic one where pieces are generated from the Voronoi diagram of a 2D point set; and 2) the MagicPlan dataset, a real one offered by MagicPlan from its production pipeline, where pieces are room layouts constructed by real-estate consumers using an augmented reality app. The qualitative and quantitative evaluations demonstrate that the proposed approach outperforms the competing methods by significant margins in all the tasks.
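To make the output representation concrete, the following is a minimal sketch of what "estimating 2D translations and rotations" means for a set of polygonal pieces. It is not the paper's implementation: the function names `apply_pose` and `arrange`, the two rectangular rooms, and the hand-picked poses are all hypothetical. In the described system the poses would be produced by the diffusion model rather than written by hand.

```python
import math


def apply_pose(polygon, theta, tx, ty):
    """Rotate a polygon by theta radians about the origin, then translate it by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in polygon]


def arrange(pieces, poses):
    """Apply one (theta, tx, ty) pose to each piece and return the arranged pieces."""
    return [apply_pose(piece, *pose) for piece, pose in zip(pieces, poses)]


# Two hypothetical rectangular rooms, each given in its own local frame.
room_a = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0)]
room_b = [(0.0, 0.0), (3.0, 0.0), (3.0, 2.0), (0.0, 2.0)]

# Hand-picked poses (an assumption, standing in for model output):
# room_a stays in place; room_b is rotated by 90 degrees and shifted
# so that it shares the wall x = 4 with room_a.
poses = [(0.0, 0.0, 0.0), (math.pi / 2, 6.0, 0.0)]

layout = arrange([room_a, room_b], poses)
```

Each piece is described by three numbers, so a puzzle with N pieces has 3N unknowns. Framing the task as conditional generation amounts to sampling those pose parameters given the piece shapes.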