Paper Title
Neuron Merging: Compensating for Pruned Neurons
Paper Authors
Paper Abstract
Network pruning is widely used to lighten and accelerate neural network models. Structured network pruning discards whole neurons or filters, leading to accuracy loss. In this work, we propose a novel concept of neuron merging, applicable to both fully connected layers and convolution layers, which compensates for the information loss due to the pruned neurons/filters. Neuron merging starts with decomposing the original weights into two matrices/tensors. One of them becomes the new weights for the current layer, and the other is what we name a scaling matrix, guiding the combination of neurons. If the activation function is ReLU, the scaling matrix can be absorbed into the next layer under certain conditions, compensating for the removed neurons. We also propose a data-free and inexpensive method to decompose the weights by utilizing the cosine similarity between neurons. Compared to the pruned model with the same topology, our merged model better preserves the output feature map of the original model; thus, it maintains the accuracy after pruning without fine-tuning. We demonstrate the effectiveness of our approach over network pruning for various model architectures and datasets. As an example, for VGG-16 on CIFAR-10, we achieve an accuracy of 93.16% while reducing total parameters by 64%, without any fine-tuning. The code can be found here: https://github.com/friendshipkim/neuron-merging
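
Below is a minimal NumPy sketch of the merging idea for two consecutive fully connected layers with a ReLU between them, as described in the abstract: the current layer's weights are decomposed into new weights Y and a scaling matrix Z, each pruned neuron is redirected to its most cosine-similar retained neuron, and Z is absorbed into the next layer. The l2-norm pruning criterion and the names keep_ratio and threshold are illustrative assumptions, not the authors' exact implementation.

# Minimal neuron-merging sketch for two consecutive fully connected layers
# with a ReLU in between. Assumed pruning criterion: keep the rows of W1
# with the largest l2 norm.
import numpy as np

def merge_fc_layers(W1, W2, keep_ratio=0.5, threshold=0.45):
    """Decompose W1 ~= Z @ Y and absorb the scaling matrix Z into W2.

    W1: (N, d) weights of the current layer; W2: (M, N) weights of the next layer.
    Returns Y (n, d) and W2_new (M, n) with n = round(N * keep_ratio).
    """
    N = W1.shape[0]
    n = max(1, int(round(N * keep_ratio)))

    # Select retained neurons (assumed criterion: largest l2 norm).
    norms = np.linalg.norm(W1, axis=1)
    keep = np.sort(np.argsort(-norms)[:n])
    prune = np.setdiff1d(np.arange(N), keep)

    Y = W1[keep]                       # new weights of the current layer
    Z = np.zeros((N, n))               # scaling matrix guiding the merge
    Z[keep, np.arange(n)] = 1.0        # retained neurons map to themselves

    # Data-free step: redirect each pruned neuron to its most cosine-similar
    # retained neuron, scaled by the ratio of their norms.
    unit = W1 / (norms[:, None] + 1e-12)
    for i in prune:
        cos = unit[keep] @ unit[i]
        j = int(np.argmax(cos))
        if cos[j] > threshold:
            Z[i, j] = norms[i] / (norms[keep[j]] + 1e-12)

    # Z is nonnegative with at most one nonzero entry per row, so
    # ReLU(Z @ Y @ x) == Z @ ReLU(Y @ x) and Z can be absorbed into W2.
    W2_new = W2 @ Z
    return Y, W2_new

# Usage: the merged two-layer network approximates the original one.
rng = np.random.default_rng(0)
x = rng.normal(size=64)
W1, W2 = rng.normal(size=(128, 64)), rng.normal(size=(10, 128))
Y, W2_new = merge_fc_layers(W1, W2, keep_ratio=0.5)
orig = W2 @ np.maximum(W1 @ x, 0.0)
merged = W2_new @ np.maximum(Y @ x, 0.0)
print(np.linalg.norm(orig - merged) / np.linalg.norm(orig))

The key property this sketch relies on is that each row of Z has at most one nonnegative nonzero entry, so the ReLU commutes with Z; this is the condition under which the abstract says the scaling matrix can be absorbed into the next layer.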