Paper Title

Neural Network Compression Via Sparse Optimization

Paper Authors

Tianyi Chen, Bo Ji, Yixin Shi, Tianyu Ding, Biyi Fang, Sheng Yi, Xiao Tu

Paper Abstract

The compression of deep neural networks (DNNs) to reduce inference cost is becoming increasingly important for meeting the realistic deployment requirements of various applications. There has been a significant amount of work on network compression, but most of it is based on heuristic rules or is difficult to incorporate into varying scenarios. On the other hand, sparse optimization, which yields sparse solutions, naturally fits the compression requirement; however, due to the limited study of sparse optimization in stochastic learning, its extension and application to model compression remain rarely explored. In this work, we propose a model compression framework based on recent progress in sparse stochastic optimization. Compared to existing model compression techniques, our method is effective, requires less extra engineering effort to incorporate into varying applications, and has been numerically demonstrated on benchmark compression tasks. In particular, we achieve up to 7.2 and 2.9 times FLOPs reduction on VGG16 for CIFAR10 and ResNet50 for ImageNet, respectively, with the same level of evaluation accuracy compared to the baseline heavy models.
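
Since the abstract hinges on the idea that sparse optimization naturally produces prunable structure, a minimal sketch may help make that concrete. The snippet below is an illustration only, not the authors' algorithm: it applies a stochastic proximal gradient step with a group-lasso penalty to a toy regression problem, so whole rows of the weight matrix (standing in for, say, output channels) are driven exactly to zero and could then be pruned. All dimensions, hyperparameters, and the synthetic task are hypothetical.

```python
# Illustrative sketch of group-sparse training (not the paper's exact method):
# SGD followed by the proximal step of a group-lasso penalty, which zeroes
# entire weight groups so the corresponding structures can be pruned.
import numpy as np

def group_soft_threshold(W, lam, lr):
    """Proximal step for the penalty lam * sum_g ||W_g||_2, with rows as
    groups: shrink each row's norm by lr * lam, zeroing rows that fall below it."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - lr * lam / np.maximum(norms, 1e-12))
    return W * scale

rng = np.random.default_rng(0)

# Synthetic regression target whose true weight matrix uses only 8 of 64 groups.
W_true = np.zeros((64, 32))
W_true[:8] = rng.normal(size=(8, 32))

W = rng.normal(size=(64, 32)) * 0.1   # trainable weights; rows = groups
lam, lr = 0.05, 0.1                   # hypothetical penalty weight and step size

for step in range(500):
    X = rng.normal(size=(16, 32))                 # random mini-batch
    grad = (X @ W.T - X @ W_true.T).T @ X / 16    # gradient of the mean squared error
    W = group_soft_threshold(W - lr * grad, lam, lr)

active = np.count_nonzero(np.linalg.norm(W, axis=1))
print(f"groups still active after training: {active}/{W.shape[0]}")
```

Rows whose norm reaches exactly zero contribute nothing to the forward pass, which is what translates sparsity found by the optimizer into actual FLOPs reduction once the corresponding structures are removed.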
