Paper Title
U-Boost NAS: Utilization-Boosted Differentiable Neural Architecture Search
Paper Authors
Abstract
Optimizing resource utilization on target platforms is key to achieving high performance during DNN inference. While optimizations have been proposed for inference latency, memory footprint, and energy consumption, prior hardware-aware neural architecture search (NAS) methods have omitted resource utilization, preventing DNNs from taking full advantage of the target inference platforms. Modeling resource utilization efficiently and accurately is challenging, especially for widely used array-based inference accelerators such as the Google TPU. In this work, we propose a novel hardware-aware NAS framework that optimizes not only for task accuracy and inference latency, but also for resource utilization. We also propose and validate a new computational model for resource utilization in inference accelerators. Using the proposed NAS framework together with the proposed resource utilization model, we achieve a 2.8-4x speedup for DNN inference compared to prior hardware-aware NAS methods, while attaining similar or improved image classification accuracy on the CIFAR-10 and ImageNet-100 datasets.
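To illustrate why utilization matters on array-based accelerators, the sketch below estimates processing-element (PE) utilization for a matrix multiplication mapped onto a systolic array with simple output tiling. This is a hypothetical toy model for intuition only, not the utilization model proposed in the paper; the function name, tiling scheme, and 128x128 array size are assumptions.

```python
import math

def systolic_utilization(M, N, K, rows=128, cols=128):
    """Estimate PE utilization for an (M x K) @ (K x N) matmul on a
    rows x cols systolic array, assuming naive output tiling.
    Illustrative toy model, NOT the paper's proposed model."""
    # Tiles needed to cover the M x N output matrix.
    tiles_m = math.ceil(M / rows)
    tiles_n = math.ceil(N / cols)
    # PE slots allocated across all tiles vs. slots doing useful work.
    allocated = tiles_m * tiles_n * rows * cols
    useful = M * N
    return useful / allocated

# A layer whose dimensions match the array uses every PE, while a
# slightly larger layer spills into mostly-empty extra tiles.
print(systolic_utilization(128, 128, 64))  # 1.0
print(systolic_utilization(130, 130, 64))  # ~0.258
```

Under this toy model, growing an output dimension from 128 to 130 drops utilization from 100% to about 26%, which is the kind of hardware-layer mismatch a utilization-aware NAS objective would steer the architecture away from.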