Paper Title
Hyper-Representations as Generative Models: Sampling Unseen Neural Network Weights
Paper Authors
Paper Abstract
Learning representations of neural network weights given a model zoo is an emerging and challenging area with many potential applications, from model inspection to neural architecture search or knowledge distillation. Recently, an autoencoder trained on a model zoo was able to learn a hyper-representation, which captures intrinsic and extrinsic properties of the models in the zoo. In this work, we extend hyper-representations for generative use to sample new model weights. We propose layer-wise loss normalization, which we demonstrate is key to generating high-performing models, along with several sampling methods based on the topology of hyper-representations. The models generated using our methods are diverse and performant, and capable of outperforming strong baselines on several downstream tasks: initialization, ensemble sampling, and transfer learning. Our results indicate the potential of aggregating knowledge from model zoos into new models via hyper-representations, thereby paving the way for novel research directions.
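The abstract names two technical components: layer-wise loss normalization of the autoencoder's reconstruction objective, and sampling new weights from the learned latent space. Below is a minimal sketch of one plausible reading of both, assuming flattened per-layer weight vectors and an already-trained encoder/decoder pair. The names `layerwise_normalized_loss`, `sample_new_weights`, `layer_slices`, the per-layer variance rescaling, and the choice of a kernel density estimator over the latent embeddings are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch, not the authors' implementation: shapes and names are assumed.
import torch
import numpy as np
from scipy.stats import gaussian_kde

def layerwise_normalized_loss(pred, target, layer_slices):
    """Reconstruction MSE computed per layer, each rescaled by that
    layer's weight variance, so layers with small weight magnitudes
    are not dominated by layers with large ones.

    pred, target: (batch, num_weights) flattened weight vectors.
    layer_slices: list of slice objects marking each layer's weights.
    """
    loss = 0.0
    for sl in layer_slices:
        var = target[:, sl].var().clamp_min(1e-8)  # per-layer scale
        loss = loss + torch.mean((pred[:, sl] - target[:, sl]) ** 2) / var
    return loss / len(layer_slices)

def sample_new_weights(encoder, decoder, zoo_weights, n_samples=10):
    """Fit a density to the latent embeddings of the zoo models and
    decode draws from it into new (unseen) weight vectors."""
    with torch.no_grad():
        z = encoder(zoo_weights).cpu().numpy()  # (n_models, d_latent)
    # KDE needs more models than latent dimensions to be well-posed.
    kde = gaussian_kde(z.T)                     # density over latent space
    z_new = torch.as_tensor(kde.resample(n_samples).T, dtype=torch.float32)
    with torch.no_grad():
        return decoder(z_new)                   # (n_samples, num_weights)
```

Decoded samples could then be reshaped into a network's layer tensors and used directly, e.g. as initializations or ensemble members, which matches the downstream tasks the abstract evaluates.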