Paper Title
Discovering Multi-Hardware Mobile Models via Architecture Search
Paper Authors
Paper Abstract
Hardware-aware neural architecture design has predominantly focused on optimizing model performance on a single hardware platform and on model development complexity, while another important factor, model deployment complexity, has been largely ignored. In this paper, we argue that, for applications that may be deployed on multiple hardware platforms, maintaining a different single-hardware model for each deployment target makes it hard to guarantee consistent outputs across hardware and duplicates engineering work for debugging and fixing. To minimize such deployment cost, we propose an alternative solution, multi-hardware models, where a single architecture is developed for multiple hardware. With thoughtful search space design and by incorporating the proposed multi-hardware metrics into neural architecture search, we discover multi-hardware models that give state-of-the-art (SoTA) performance across multiple hardware in both the average and the worst case. For performance on individual hardware, a single multi-hardware model yields similar or better results than SoTA performance on accelerators such as GPU, DSP, and EdgeTPU, which was previously achieved by different models, while matching the performance of the MobileNetV3-Large minimalistic model on mobile CPU.
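To make the notion of multi-hardware metrics concrete, below is a minimal Python sketch of one way per-hardware latencies could be aggregated into average and worst-case scores and folded into a latency-aware NAS reward. All names, the normalization by per-hardware reference latencies, and the MnasNet-style reward form are assumptions for illustration only; the paper defines the exact metrics and search procedure.

    # A minimal sketch (not the authors' implementation) of multi-hardware
    # latency metrics used inside a latency-aware NAS objective. Reference
    # latencies, weighting, and the reward form are illustrative assumptions.

    from typing import Dict

    def multi_hardware_metrics(latency_ms: Dict[str, float],
                               reference_ms: Dict[str, float]) -> Dict[str, float]:
        """Aggregate per-hardware latencies into average and worst-case scores.

        latency_ms:   measured latency of a candidate architecture per hardware.
        reference_ms: per-hardware reference latencies used for normalization
                      (hypothetical; chosen to make different hardware comparable).
        """
        normalized = {hw: latency_ms[hw] / reference_ms[hw] for hw in latency_ms}
        return {
            "avg_latency": sum(normalized.values()) / len(normalized),
            "worst_latency": max(normalized.values()),
        }

    def nas_reward(accuracy: float, latency_ms: Dict[str, float],
                   reference_ms: Dict[str, float], beta: float = -0.07) -> float:
        """Soft latency-constrained reward in the MnasNet style: acc * latency^beta.

        Here the single-hardware latency term is replaced by the average
        normalized latency across all target hardware (an assumption; the
        worst-case metric could be used analogously).
        """
        metrics = multi_hardware_metrics(latency_ms, reference_ms)
        return accuracy * (metrics["avg_latency"] ** beta)

    # Example: one candidate measured on three targets (numbers are made up).
    if __name__ == "__main__":
        lat = {"cpu": 8.1, "gpu": 3.9, "edgetpu": 2.2}
        ref = {"cpu": 10.0, "gpu": 5.0, "edgetpu": 3.0}
        print(multi_hardware_metrics(lat, ref))
        print(nas_reward(accuracy=0.75, latency_ms=lat, reference_ms=ref))

Under this sketch, optimizing the average-latency reward favors architectures that are fast on the deployed hardware overall, while tracking the worst-case metric guards against an architecture that is fast on most accelerators but unacceptably slow on one of them.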