Paper Title

Detecting soccer balls with reduced neural networks: a comparison of multiple architectures under constrained hardware scenarios

Paper Authors

Meneghetti, Douglas De Rizzo; Homem, Thiago Pedro Donadon; de Oliveira, Jonas Henrique Renolfi; da Silva, Isaac Jesus; Perico, Danilo Hernani; Bianchi, Reinaldo Augusto da Costa

Paper Abstract

Object detection techniques that achieve state-of-the-art detection accuracy employ convolutional neural networks, implemented to have optimal performance on graphics processing units. Some hardware systems, such as mobile robots, operate under constrained hardware conditions but still benefit from object detection capabilities. Multiple network models have been proposed that achieve comparable accuracy with reduced architectures and leaner operations. Motivated by the need to create an object detection system for a soccer team of mobile robots, this work provides a comparative study of recent neural network proposals targeted at constrained hardware environments, on the specific task of soccer ball detection. We train multiple open implementations of MobileNetV2 and MobileNetV3 models with different underlying architectures, as well as YOLOv3, TinyYOLOv3, YOLOv4 and TinyYOLOv4, on an annotated image data set captured using a mobile robot. We then report their mean average precision on a test data set and their inference times on videos of different resolutions, under constrained and unconstrained hardware configurations. Results show that MobileNetV3 models offer a good trade-off between mAP and inference time in constrained scenarios only, while MobileNetV2 models with high width multipliers are appropriate for server-side inference. YOLO models in their official implementations are not suitable for inference on CPUs.
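The abstract's benchmarking protocol (per-frame inference latency at several input resolutions under a constrained CPU configuration) can be illustrated with a minimal sketch. The snippet below is not the authors' benchmark code: it uses the standalone `tf.keras.applications.MobileNetV3Small` classification backbone as a stand-in for the full SSD-style detector, random tensors instead of robot camera frames, and an assumed thread limit to emulate constrained hardware; the resolutions and frame count are illustrative choices only.

```python
# Hedged sketch of latency measurement, not the paper's actual benchmark.
# A MobileNetV3Small backbone (no detection head) is timed on random "frames"
# at several resolutions with TensorFlow restricted to a few CPU threads.
import time
import numpy as np
import tensorflow as tf

# Emulate a constrained-hardware scenario by limiting CPU parallelism
# (must be set before TensorFlow executes any ops).
tf.config.threading.set_intra_op_parallelism_threads(2)
tf.config.threading.set_inter_op_parallelism_threads(1)

for resolution in (224, 320, 480):  # illustrative input resolutions
    # weights=None skips the ImageNet download; latency depends on the
    # architecture (and its width multiplier), not on the weight values.
    model = tf.keras.applications.MobileNetV3Small(
        input_shape=(resolution, resolution, 3), weights=None)
    frames = np.random.rand(30, resolution, resolution, 3).astype(np.float32)

    model.predict(frames[:1], verbose=0)  # warm-up to exclude graph build time
    start = time.perf_counter()
    for frame in frames:
        model.predict(frame[None, ...], verbose=0)  # one frame at a time, as in video inference
    elapsed = time.perf_counter() - start
    print(f"{resolution}x{resolution}: {1000 * elapsed / len(frames):.1f} ms/frame")
```

The same loop could be repeated without the thread limits (and on a GPU) to reproduce the constrained-versus-unconstrained comparison the abstract describes.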
