Paper Title
Efficient Multi-Task RGB-D Scene Analysis for Indoor Environments
Paper Authors
Paper Abstract
Semantic scene understanding is essential for mobile agents acting in various environments. Although semantic segmentation already provides a lot of information, details about individual objects as well as the general scene are missing but required for many real-world applications. However, solving multiple tasks separately is expensive and cannot be accomplished in real time given the limited computing and battery capabilities of a mobile platform. In this paper, we propose an efficient multi-task approach for RGB-D scene analysis (EMSANet) that simultaneously performs semantic and instance segmentation (panoptic segmentation), instance orientation estimation, and scene classification. We show that all tasks can be accomplished using a single neural network in real time on a mobile platform without diminishing performance; on the contrary, the individual tasks are able to benefit from each other. In order to evaluate our multi-task approach, we extend the annotations of the common RGB-D indoor datasets NYUv2 and SUNRGB-D for instance segmentation and orientation estimation. To the best of our knowledge, we are the first to provide results in such a comprehensive multi-task setting for indoor scene analysis on NYUv2 and SUNRGB-D.