Paper Title


You Have a Point There: Object Selection Inside an Automobile Using Gaze, Head Pose and Finger Pointing

Authors

Abdul Rafey Aftab, Michael von der Beeck, Michael Feld

Abstract


Sophisticated user interaction in the automotive industry is a fast emerging topic. Mid-air gestures and speech already have numerous applications for driver-car interaction. Additionally, multimodal approaches are being developed to leverage the use of multiple sensors for added advantages. In this paper, we propose a fast and practical multimodal fusion method based on machine learning for the selection of various control modules in an automotive vehicle. The modalities taken into account are gaze, head pose and finger pointing gesture. Speech is used only as a trigger for fusion. Single modality has previously been used numerous times for recognition of the user's pointing direction. We, however, demonstrate how multiple inputs can be fused together to enhance the recognition performance. Furthermore, we compare different deep neural network architectures against conventional Machine Learning methods, namely Support Vector Regression and Random Forests, and show the enhancements in the pointing direction accuracy using deep learning. The results suggest a great potential for the use of multimodal inputs that can be applied to more use cases in the vehicle.
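The abstract describes fusing gaze, head pose, and finger-pointing directions to select an in-car object, with speech acting only as the trigger. As a minimal illustration of the idea (not the paper's learned fusion, which uses SVR, Random Forests, and deep networks), the sketch below fuses per-modality direction estimates with a confidence-weighted average and picks the object whose direction best matches by cosine similarity. All vectors, weights, and object names are hypothetical.

```python
import math

def normalize(v):
    """Scale a 3D vector to unit length."""
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def fuse_directions(modalities):
    """Confidence-weighted average of per-modality pointing directions.

    `modalities` maps a modality name to (direction_vector, weight).
    This is a simple geometric baseline; the paper instead learns the
    fusion with ML regressors.
    """
    fused = [0.0, 0.0, 0.0]
    for vec, weight in modalities.values():
        for i, c in enumerate(normalize(vec)):
            fused[i] += weight * c
    return normalize(fused)

def select_object(fused_dir, objects):
    """Select the object whose direction has the highest cosine similarity."""
    return max(
        objects,
        key=lambda name: sum(a * b for a, b in zip(fused_dir, normalize(objects[name]))),
    )

# Hypothetical sensor readings: finger pointing weighted highest.
modalities = {
    "gaze":   ((0.0, 0.2, 0.98), 0.3),
    "head":   ((0.0, 0.1, 0.99), 0.2),
    "finger": ((0.3, 0.0, 0.95), 0.5),
}
# Hypothetical in-car control modules and their directions from the driver.
objects = {
    "radio":   (0.2, 0.0, 1.0),
    "sunroof": (0.0, 1.0, 0.2),
}

fused = fuse_directions(modalities)
print(select_object(fused, objects))  # → radio
```

Here the pointing gesture dominates the fused direction, so the selection lands on the object nearest the combined estimate; in the paper, the per-modality weighting is learned rather than hand-set.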
