Paper Title
SceneGen: Generative Contextual Scene Augmentation using Scene Graph Priors
Paper Authors
Paper Abstract
Spatial computing experiences are constrained by the real-world surroundings of the user. In such experiences, augmenting existing scenes with virtual objects requires a contextual approach, in which geometric conflicts are avoided and functional, plausible relationships to other objects in the target environment are maintained. Yet, due to the complexity and diversity of user environments, automatically calculating ideal positions for virtual content that adapt to the context of the scene remains a challenging task. Motivated by this problem, in this paper we introduce SceneGen, a generative contextual augmentation framework that predicts virtual object positions and orientations within existing scenes. SceneGen takes a semantically segmented scene as input and outputs positional and orientational probability maps for placing virtual content. We formulate a novel spatial Scene Graph representation, which encapsulates explicit topological properties between objects, object groups, and the room. We believe that providing explicit and intuitive features plays an important role in informative content creation and user interaction in spatial computing settings, a quality that implicit models do not capture. We use kernel density estimation (KDE) to build a multivariate conditional knowledge model, trained on prior spatial Scene Graphs extracted from real-world 3D scanned data. To further capture orientational properties, we develop a fast pose annotation tool that extends current real-world datasets with orientation labels. Finally, to demonstrate our system in action, we develop an Augmented Reality application in which objects can be contextually augmented in real time.
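
To make the KDE-based placement scoring described above concrete, the sketch below is a minimal illustration, not the authors' implementation: it fits a Gaussian KDE over hypothetical pairwise features (distance and relative angle between an anchor object and a candidate placement, stand-ins for the paper's explicit topological Scene Graph properties) and evaluates it on a room grid to produce a normalized positional probability map. All feature choices, helper names, and sample values are illustrative assumptions.

import numpy as np
from scipy.stats import gaussian_kde

# Hypothetical prior observations: for each (sofa, table) pair seen in
# scanned scenes, record [distance, relative_angle] as explicit features.
prior_features = np.array([
    [0.8, 0.1], [0.9, -0.05], [1.0, 0.0], [0.7, 0.2], [1.1, -0.1],
]).T  # gaussian_kde expects shape (n_dims, n_samples)

kde = gaussian_kde(prior_features)

def placement_probability_map(anchor_xy, anchor_theta, grid_xy):
    """Score each candidate (x, y) on a room grid by the KDE likelihood
    of its distance / relative-angle features w.r.t. an anchor object."""
    deltas = grid_xy - anchor_xy                       # (m, 2) offsets
    dists = np.linalg.norm(deltas, axis=1)             # (m,) distances
    angles = np.arctan2(deltas[:, 1], deltas[:, 0]) - anchor_theta
    angles = (angles + np.pi) % (2 * np.pi) - np.pi    # wrap to [-pi, pi)
    scores = kde(np.vstack([dists, angles]))           # (m,) densities
    return scores / scores.max()                       # normalized map

# Example: score a 41x41 grid of candidate positions around an anchor
# object at the origin facing along +x.
xs, ys = np.meshgrid(np.linspace(-2, 2, 41), np.linspace(-2, 2, 41))
grid = np.column_stack([xs.ravel(), ys.ravel()])
prob_map = placement_probability_map(np.zeros(2), 0.0, grid)

In the actual system, the features would be drawn from the spatial Scene Graph (relations to other objects, object groups, and the room) and the density would be conditioned on object categories; this single pairwise density only demonstrates how a positional probability map can arise from KDE over explicit spatial features.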