Paper Title

Cross-modal Learning of Graph Representations using Radar Point Cloud for Long-Range Gesture Recognition

Authors

Souvik Hazra, Hao Feng, Gamze Naz Kiprit, Michael Stephan, Lorenzo Servadei, Robert Wille, Robert Weigel, Avik Santra

Abstract

Gesture recognition is one of the most intuitive ways of interaction and has gathered particular attention in human-computer interaction. Radar sensors possess multiple intrinsic properties, such as their ability to work in low illumination and harsh weather conditions while being low-cost and compact, making them highly preferable for a gesture recognition solution. However, most work in the literature focuses on solutions with a limited range of less than a meter. We propose a novel architecture for a long-range (1 m - 2 m) gesture recognition solution that leverages a point cloud-based cross-learning approach from camera point clouds to 60-GHz FMCW radar point clouds, which allows learning better representations while suppressing noise. We use a variant of Dynamic Graph CNN (DGCNN) for the cross-learning, enabling us to model relationships between the points at both local and global levels, and a Bi-LSTM network is employed to model the temporal dynamics. In the experimental results section, we demonstrate our model's overall accuracy of 98.4% for five gestures and its generalization capability.
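To make the DGCNN component of the pipeline concrete, below is a minimal, hypothetical sketch of the EdgeConv operation at the heart of DGCNN, not the paper's actual implementation: each point gathers its k nearest neighbours, forms an edge feature (x_i, x_j - x_i), applies a shared linear map (random weights here stand in for a learned MLP), and max-pools over the neighbourhood. All names and dimensions are illustrative assumptions.

```python
import math
import random

def knn(points, k):
    """Indices of the k nearest neighbours of each point (excluding itself)."""
    idx = []
    for i, p in enumerate(points):
        dists = sorted(
            (math.dist(p, q), j) for j, q in enumerate(points) if j != i
        )
        idx.append([j for _, j in dists[:k]])
    return idx

def edge_conv(points, k, weights):
    """One EdgeConv layer: per-edge linear map, then channel-wise max pooling.

    Edge feature for neighbour j of point i is (x_i, x_j - x_i), as in DGCNN.
    `weights` is an out_dim x (2 * point_dim) matrix standing in for the MLP.
    """
    neigh = knn(points, k)
    out = []
    for i, p in enumerate(points):
        feats = []
        for j in neigh[i]:
            edge = list(p) + [a - b for a, b in zip(points[j], p)]
            feats.append([sum(w * e for w, e in zip(row, edge)) for row in weights])
        # Aggregate over the neighbourhood with a channel-wise max.
        out.append([max(col) for col in zip(*feats)])
    return out

random.seed(0)
pts = [(random.random(), random.random(), random.random()) for _ in range(8)]
W = [[random.uniform(-1, 1) for _ in range(6)] for _ in range(4)]  # 6-d edge -> 4-d feature
features = edge_conv(pts, k=3, weights=W)
print(len(features), len(features[0]))  # one 4-d feature per input point
```

In the full architecture described in the abstract, per-frame point features like these would be fed to a Bi-LSTM to model the temporal dynamics across the gesture sequence.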
