Paper title
Multi-Objective Evolutionary for Object Detection Mobile Architectures Search
Paper authors
Paper abstract
Recently, neural architecture search (NAS) has achieved great success on classification tasks for mobile devices. The backbone network for object detection is usually obtained from an image classification task. However, an architecture searched on classification is sub-optimal for detection because of the gap between the image classification and object detection tasks. Work on backbone architecture search for mobile object detection remains limited, mainly because the backbone typically requires expensive ImageNet pre-training. It is therefore necessary to study architecture search for mobile object detection that does not rely on expensive pre-training. In this work, we propose a backbone architecture search algorithm for mobile object detection: an evolutionary optimization method based on non-dominated sorting, designed for NAS scenarios. It can quickly find backbone architectures that satisfy given constraints. By treating accuracy and computational cost as separate objectives, it avoids the sub-optimal results obtained when the two are merged into a single linear combination. The proposed approach can search backbone networks with different depths, widths, or expansion sizes via a weight mapping technique, which makes NAS considerably more efficient for mobile detection tasks. In our experiments, we verify the effectiveness of the proposed approach on YoloX-Lite, a lightweight version of the object detection framework. Under similar computational complexity, the backbone architecture we find achieves 2.0% higher mAP than MobileDet. Our improved backbone reduces computational cost while improving the accuracy of the object detection network. To demonstrate its effectiveness, we carry out a series of ablation studies and analyze the working mechanism in detail.
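The non-dominated sorting that the abstract refers to can be illustrated with a minimal Python sketch. This is not the authors' implementation: the two objectives (maximize mAP, minimize FLOPs), the function names, and the candidate values are all assumptions made up for illustration. The sketch only shows how candidates are grouped into Pareto fronts without collapsing accuracy and cost into one weighted score.

```python
from typing import List, Tuple

# Each candidate is (mAP, GFLOPs): maximise mAP, minimise GFLOPs.
# Both the objective pair and the numbers below are illustrative assumptions.
Candidate = Tuple[float, float]


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[0] >= b[0] and a[1] <= b[1]
    strictly_better = a[0] > b[0] or a[1] < b[1]
    return no_worse and strictly_better


def non_dominated_sort(pop: List[Candidate]) -> List[List[int]]:
    """Group candidate indices into Pareto fronts, best front first."""
    remaining = set(range(len(pop)))
    fronts: List[List[int]] = []
    while remaining:
        # A candidate belongs to the current front if no other
        # remaining candidate dominates it.
        front = sorted(
            i for i in remaining
            if not any(dominates(pop[j], pop[i]) for j in remaining if j != i)
        )
        fronts.append(front)
        remaining -= set(front)
    return fronts


# Hypothetical (mAP, GFLOPs) pairs, not results from the paper.
population = [(30.0, 1.0), (28.0, 0.8), (27.0, 1.2), (31.0, 1.5), (26.0, 0.9)]
fronts = non_dominated_sort(population)
print(fronts)
```

In an evolutionary search of this kind, the first front would typically supply the parents for the next generation. A weighted sum of the two objectives would instead pick a single point on the trade-off curve, and which point it picks depends on the chosen weights.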