ViFi-Loc: Multi-modal Pedestrian Localization using GAN with Camera-Phone Correspondences

Paper Title

ViFi-Loc: Multi-modal Pedestrian Localization using GAN with Camera-Phone Correspondences

Authors

Hansi Liu, Kristin Dana, Marco Gruteser, Hongsheng Lu

Abstract

In Smart City and Vehicle-to-Everything (V2X) systems, acquiring pedestrians' accurate locations is crucial to traffic safety. Current systems adopt cameras and wireless sensors to detect and estimate people's locations via sensor fusion. Standard fusion algorithms, however, become inapplicable when multi-modal data is not associated. For example, pedestrians are out of the camera field of view, or data from camera modality is missing. To address this challenge and produce more accurate location estimations for pedestrians, we propose a Generative Adversarial Network (GAN) architecture. During training, it learns the underlying linkage between pedestrians' camera-phone data correspondences. During inference, it generates refined position estimations based only on pedestrians' phone data that consists of GPS, IMU and FTM. Results show that our GAN produces 3D coordinates at 1 to 2 meter localization error across 5 different outdoor scenes. We further show that the proposed model supports self-learning. The generated coordinates can be associated with pedestrian's bounding box coordinates to obtain additional camera-phone data correspondences. This allows automatic data collection during inference. After fine-tuning on the expanded dataset, localization accuracy is improved by up to 26%.
