Object Detection in the Context of Mobile Augmented Reality

Paper title

Object Detection in the Context of Mobile Augmented Reality

Authors

Xiang Li, Yuan Tian, Fuyao Zhang, Shuxue Quan, Yi Xu

Abstract

In the past few years, numerous Deep Neural Network (DNN) models and frameworks have been developed to tackle the problem of real-time object detection from RGB images. Ordinary object detection approaches process information from the images only, and they are oblivious to the camera pose with regard to the environment and the scale of the environment. On the other hand, mobile Augmented Reality (AR) frameworks can continuously track a camera's pose within the scene and can estimate the correct scale of the environment by using Visual-Inertial Odometry (VIO). In this paper, we propose a novel approach that combines the geometric information from VIO with semantic information from object detectors to improve the performance of object detection on mobile devices. Our approach includes three components: (1) an image orientation correction method, (2) a scale-based filtering approach, and (3) an online semantic map. Each component takes advantage of the different characteristics of the VIO-based AR framework. We implemented the AR-enhanced features using ARCore and the SSD Mobilenet model on Android phones. To validate our approach, we manually labeled objects in image sequences taken from 12 room-scale AR sessions. The results show that our approach can improve on the accuracy of generic object detectors by 12% on our dataset.
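The abstract does not spell out how the scale-based filtering component works, so the following is only a minimal sketch of one plausible reading: once VIO provides metric depth, a detection's pixel height can be converted to an approximate physical height with the pinhole camera model and compared against a per-class size range. The function names, the `SIZE_PRIORS_M` table, and its numeric ranges are all hypothetical illustrations and are not taken from the paper.

```python
# Sketch of scale-based filtering under stated assumptions (not the authors' code).
# Assumption: depth_m is the metric distance to the object, e.g. from a VIO-based
# AR framework, and focal_length_px is the camera focal length in pixels.

# Hypothetical per-class physical height ranges in metres (illustrative values only).
SIZE_PRIORS_M = {
    "chair": (0.4, 1.3),
    "cup": (0.05, 0.25),
}


def estimated_height_m(box_height_px, depth_m, focal_length_px):
    """Pinhole-camera estimate: physical height = pixel height * depth / focal length."""
    return box_height_px * depth_m / focal_length_px


def passes_scale_filter(label, box_height_px, depth_m, focal_length_px,
                        priors=SIZE_PRIORS_M):
    """Return False when the estimated physical size is implausible for the class.

    Classes without a size prior are kept, since there is nothing to check against.
    """
    if label not in priors:
        return True
    low, high = priors[label]
    height = estimated_height_m(box_height_px, depth_m, focal_length_px)
    return low <= height <= high
```

For example, a 200-pixel-tall box at 2 m with a 500-pixel focal length corresponds to roughly 0.8 m, which this sketch would accept as a "chair" but reject as a "cup".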
