Paper title
Playing Lottery Tickets in Style Transfer Models
Authors
Abstract
Style transfer has achieved great success and attracted wide attention from both academia and industry thanks to its flexible application scenarios. However, the dependence on a large VGG-based autoencoder gives existing style transfer models high parameter complexity, which limits their use on resource-constrained devices. Compared with many other tasks, the compression of style transfer models has been less explored. Recently, the lottery ticket hypothesis (LTH) has shown great potential for finding extremely sparse matching subnetworks that, when trained in isolation, achieve performance on par with or even better than the original full networks. In this work, we perform the first empirical study to verify whether such trainable matching subnetworks also exist in style transfer models. Specifically, we take two of the most popular style transfer models, AdaIN and SANet, as the main testbeds; they represent global and local transformation-based style transfer methods, respectively. We carry out extensive experiments and comprehensive analysis, and draw the following conclusions. (1) Compared with fixing the VGG encoder, style transfer models benefit more from training the whole network together. (2) Using iterative magnitude pruning, we find matching subnetworks at 89.2% sparsity in AdaIN and 73.7% sparsity in SANet, which demonstrates that style transfer models can play lottery tickets too. (3) The feature transformation module should also be pruned to obtain a much sparser model without affecting the existence and quality of the matching subnetworks. (4) Besides AdaIN and SANet, other models such as LST, MANet, AdaAttN and MCCNet can also play lottery tickets, which shows that LTH generalizes to various style transfer models.
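To make the iterative magnitude pruning mentioned in conclusion (2) concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes a flat list of weights, a per-round pruning fraction of 20% (a common choice that the abstract does not state), and rewinding of the surviving weights to their initial values before each retraining round. The `train` function is a hypothetical stand-in that only perturbs weights; a real experiment would train the style transfer model here. Under the 20% assumption, 6 and 10 rounds give sparsities of about 0.738 and 0.893, close to the figures reported for SANet and AdaIN.

```python
import random


def train(weights, mask):
    """Hypothetical stand-in for training: perturbs surviving weights only."""
    return [w + random.gauss(0, 0.01) if m else 0.0
            for w, m in zip(weights, mask)]


def magnitude_mask(weights, mask, prune_fraction):
    """Zero the mask for the smallest-magnitude fraction of surviving weights."""
    alive = [i for i, m in enumerate(mask) if m == 1]
    n_prune = int(len(alive) * prune_fraction)
    # Sort surviving indices by absolute weight, smallest first.
    alive.sort(key=lambda i: abs(weights[i]))
    new_mask = list(mask)
    for i in alive[:n_prune]:
        new_mask[i] = 0
    return new_mask


def iterative_magnitude_pruning(init_weights, rounds, prune_fraction=0.2):
    """Repeat: rewind to initial weights, train, prune by magnitude."""
    mask = [1] * len(init_weights)
    for _ in range(rounds):
        # Rewind surviving weights to their initial values (assumed variant).
        rewound = [w * m for w, m in zip(init_weights, mask)]
        trained = train(rewound, mask)
        mask = magnitude_mask(trained, mask, prune_fraction)
    return mask


def sparsity(mask):
    """Fraction of weights that have been pruned."""
    return 1 - sum(mask) / len(mask)


random.seed(0)
init_weights = [random.gauss(0, 1) for _ in range(10000)]
for rounds in (6, 10):
    mask = iterative_magnitude_pruning(init_weights, rounds)
    print(rounds, "rounds -> sparsity", round(sparsity(mask), 3))
```

In the lottery ticket setting, the subnetwork defined by the final mask counts as "matching" only if training it in isolation reaches performance on par with the full network. This sketch does not evaluate that.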