Paper Title

ADWPNAS: Architecture-Driven Weight Prediction for Neural Architecture Search

Authors

Xu Zhang, Chenjun Zhou, Bo Gu

Abstract

Quickly and accurately discovering and evaluating the true strength of models is one of the key challenges in Neural Architecture Search (NAS). To address this problem, we propose an Architecture-Driven Weight Prediction (ADWP) approach for NAS. In our approach, we first design an architecture-intensive search space and then train a HyperNetwork by feeding it stochastically encoded architecture parameters. The trained HyperNetwork can then predict the convolution kernel weights well for neural architectures in the search space. Consequently, target architectures can be evaluated efficiently without any fine-tuning, which enables us to search for the optimal architecture in the space of general networks (macro-search). In experiments, we evaluate the models discovered by the proposed ADWPNAS; the results show that one search procedure can be completed in 4.0 GPU hours on CIFAR-10. Moreover, the discovered model obtains a test error of 2.41% with only 1.52M parameters, which is superior to the best existing models.
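The weight-prediction idea in the abstract can be illustrated with a toy sketch. It is not the authors' implementation: the linear hypernetwork, the concatenated one-hot encoding, and every name below (`make_hypernetwork`, `predict_kernel`, `random_encoding`) are assumptions chosen for illustration, and training of the hypernetwork is omitted entirely. The sketch only shows how an architecture encoding might be mapped to a convolution kernel so that a sampled architecture receives weights without being trained on its own.

```python
import random


def make_hypernetwork(encoding_dim, kernel_shape, seed=0):
    """Build a toy, untrained linear hypernetwork (hypothetical design)."""
    rng = random.Random(seed)
    out_ch, in_ch, kh, kw = kernel_shape
    n_out = out_ch * in_ch * kh * kw
    # One row of parameters per predicted kernel weight.
    W = [[rng.gauss(0.0, 0.1) for _ in range(encoding_dim)] for _ in range(n_out)]
    b = [0.0] * n_out
    return {"W": W, "b": b, "kernel_shape": kernel_shape}


def random_encoding(num_choices_per_layer, rng):
    """Sample one architecture as a concatenation of one-hot choices."""
    enc = []
    for n in num_choices_per_layer:
        choice = rng.randrange(n)
        enc.extend(1.0 if i == choice else 0.0 for i in range(n))
    return enc


def predict_kernel(hyper, encoding):
    """Map an architecture encoding to a kernel of shape (out, in, kh, kw)."""
    flat = [
        sum(w * e for w, e in zip(row, encoding)) + bias
        for row, bias in zip(hyper["W"], hyper["b"])
    ]
    out_ch, in_ch, kh, kw = hyper["kernel_shape"]
    it = iter(flat)
    # Reshape the flat prediction into nested lists.
    return [
        [[[next(it) for _ in range(kw)] for _ in range(kh)] for _ in range(in_ch)]
        for _ in range(out_ch)
    ]


# Three layers with 3, 3, and 2 candidate operations (assumed search space).
choices = [3, 3, 2]
hyper = make_hypernetwork(sum(choices), kernel_shape=(4, 2, 3, 3))
enc = random_encoding(choices, random.Random(42))
kernel = predict_kernel(hyper, enc)
```

In a full system the hypernetwork parameters would be learned while random encodings are sampled during training, and candidate architectures would then be ranked by the validation accuracy obtained with their predicted weights.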
