Paper title
Neural Architecture Design for GPU-Efficient Networks
Paper authors
Paper abstract
Many mission-critical systems rely on GPUs for inference. Such systems require not only high recognition accuracy but also low response latency. Although many studies are devoted to optimizing the structure of deep models for efficient inference, most of them do not leverage the architecture of \textbf{modern GPUs} for fast inference, leading to suboptimal performance. To address this issue, we propose a general principle for designing GPU-efficient networks based on extensive empirical studies. This design principle enables us to search for GPU-efficient network structures effectively with a simple and lightweight method, as opposed to most Neural Architecture Search (NAS) methods, which are complicated and computationally expensive. Based on the proposed framework, we design a family of GPU-Efficient Networks, or GENets for short. We conducted extensive evaluations on multiple GPU platforms and inference engines. While achieving $\geq 81.3\%$ top-1 accuracy on ImageNet, GENet is up to $6.4$ times faster than EfficientNet on GPU. It also outperforms most state-of-the-art models that are more efficient than EfficientNet in high-precision regimes. Our source code and pre-trained models are available from \url{https://github.com/idstcv/GPU-Efficient-Networks}.