Paper Title

Efficient Incorporation of Multiple Latency Targets in the Once-For-All Network

Authors

Vidhur Kumar, Andrew Szidon

Abstract

Neural Architecture Search (NAS) has proven to be an effective method for automating architecture engineering. Recent work in the field has focused on finding architectures subject to multiple objectives, such as accuracy and latency, so that they can be deployed efficiently on different target hardware. Once-for-All (OFA) is one such method: it decouples training from search and can find high-performance networks for different latency constraints. However, its search phase is inefficient at incorporating multiple latency targets. In this paper, we introduce two strategies (Top-down and Bottom-up) that use warm starting and randomized network pruning to efficiently incorporate multiple latency targets into the OFA network. We evaluate these strategies against the current OFA implementation and demonstrate that our strategies offer significant running-time gains without sacrificing the accuracy of the subnetworks found for each latency target. We further show that these performance gains generalize to every design space used by the OFA network.
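To make the warm-starting idea concrete, the following is a minimal toy sketch of a top-down style search over several latency targets: latency targets are processed from loosest to tightest, and the candidate population found for one target seeds (warm-starts) the search for the next. All names and constants here are hypothetical placeholders — the per-layer width encoding, the additive latency model, and the quadratic accuracy proxy are illustrative stand-ins, not the paper's actual predictors or the OFA search space.

```python
import random

LAYERS = 6          # hypothetical: subnetworks are vectors of per-layer widths
MAX_W, MIN_W = 4, 1

def latency(net):
    # Toy latency model: total width (a stand-in for a real latency predictor).
    return sum(net)

def accuracy(net):
    # Toy accuracy proxy: wider layers score higher (a stand-in for a real
    # accuracy predictor).
    return sum(w * w for w in net)

def prune_to(net, target):
    """Randomly prune layer widths until the latency target is met."""
    net = list(net)
    while latency(net) > target:
        i = random.randrange(LAYERS)
        if net[i] > MIN_W:
            net[i] -= 1
    return net

def search(target, seeds, steps=300):
    """Random search under a latency budget, warm-started from `seeds`.

    Each step copies a candidate, grows one layer to explore, then
    re-prunes back under the budget.
    """
    population = [prune_to(s, target) for s in seeds]
    for _ in range(steps):
        child = list(random.choice(population))
        i = random.randrange(LAYERS)
        if child[i] < MAX_W:
            child[i] += 1
        population.append(prune_to(child, target))
    best = max(population, key=accuracy)
    return best, population

random.seed(0)
supernet = [MAX_W] * LAYERS
targets = [20, 16, 12]   # latency targets, loosest to tightest (top-down order)
seeds = [supernet]
results = {}
for t in targets:
    best, seeds = search(t, seeds)   # warm-start from the previous population
    results[t] = best
```

The Bottom-up variant would instead start from small subnetworks and process targets from tightest to loosest; the warm-starting mechanics are the same.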
