Paper Title
AlphaGAN: Fully Differentiable Architecture Search for Generative Adversarial Networks
Authors
Abstract
Generative Adversarial Networks (GANs) are formulated as minimax game problems, in which generators attempt to approach real data distributions through adversarial learning against discriminators. The intrinsic complexity of this problem makes it challenging to enhance the performance of generative networks. In this work, we aim to boost model learning from the perspective of network architectures by incorporating recent progress on automated architecture search into GANs. To this end, we propose a fully differentiable search framework for generative adversarial networks, dubbed alphaGAN. The search process is formalized as solving a bi-level minimax optimization problem, in which the outer-level objective seeks a suitable network architecture towards pure Nash Equilibrium, conditioned on the generator and discriminator network parameters optimized with a traditional GAN loss in the inner level. The entire optimization uses a first-order method, alternately minimizing the two-level objective in a fully differentiable manner, which enables architecture search to be completed in an enormous search space. Extensive experiments on the CIFAR-10 and STL-10 datasets show that our algorithm can obtain high-performing architectures in only 3 GPU hours on a single GPU, in a search space comprising approximately 2 × 10^11 possible configurations. We also provide a comprehensive analysis of the behavior of the search process and the properties of the searched architectures, which should benefit further research on architectures for generative models. Pretrained models and code are available at https://github.com/yuesongtian/AlphaGAN.
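To make the alternating first-order scheme described in the abstract more concrete, the following is a minimal sketch on a toy scalar problem. It is not the authors' implementation. The value function `V`, the scalar "weights" `g` and `d`, the scalar "architecture parameter" `a`, and all step sizes are assumptions chosen for illustration. In particular, the outer objective in the paper is defined with respect to pure Nash Equilibrium, whereas this sketch simply reuses the toy value function as a stand-in for it.

```python
# Toy illustration of alternating first-order updates for a bi-level
# minimax problem. Everything here is a hypothetical stand-in:
#   g : scalar "generator weight"      (inner level, minimizes V)
#   d : scalar "discriminator weight"  (inner level, maximizes V)
#   a : scalar "architecture parameter" (outer level)
# V(g, d, a) = d * (a * g - target) - 0.5 * d**2 is an assumed toy value
# function, not the GAN loss or the outer objective used in the paper.


def value_grads(g, d, a, target=1.0):
    """Return (dV/dg, dV/dd, dV/da) for the toy value function."""
    residual = a * g - target
    grad_g = d * a           # dV/dg
    grad_d = residual - d    # dV/dd
    grad_a = d * g           # dV/da
    return grad_g, grad_d, grad_a


def alternating_search(steps=3000, inner_steps=1, lr=0.05, lr_arch=0.05):
    """Alternate inner weight updates with a first-order outer update."""
    g, d, a = 0.5, 0.0, 0.5
    for _ in range(steps):
        # Inner level: update the weights with the architecture fixed.
        for _ in range(inner_steps):
            grad_g, grad_d, _ = value_grads(g, d, a)
            g -= lr * grad_g   # generator descends on V
            d += lr * grad_d   # discriminator ascends on V
        # Outer level: plain gradient step on the architecture parameter
        # with the weights held fixed (no second-order terms).
        _, _, grad_a = value_grads(g, d, a)
        a -= lr_arch * grad_a
    return g, d, a


if __name__ == "__main__":
    g, d, a = alternating_search()
    print(f"g={g:.4f}, d={d:.4f}, a={a:.4f}, a*g={a * g:.4f}")
```

In this toy setting the stationary point has `d = 0` and `a * g = target`. The point of the sketch is only the update structure: each outer step uses plain gradients with the inner variables held fixed, so no second-order derivatives through the inner optimization are needed.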