Paper Title

The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics

Authors

Yael Ben-Guigui, Jacob Goldberger, Tammy Riklin-Raviv

Abstract

The pressing need to reduce the capacity of deep neural networks has stimulated the development of network dilution methods and their analysis. While the ability of $L_1$ and $L_0$ regularization to encourage sparsity is often mentioned, $L_2$ regularization is seldom discussed in this context. We present a novel framework for weight pruning by sampling from a probability function that favors the zeroing of smaller weights. In addition, we examine the contribution of $L_1$ and $L_2$ regularization to the dynamics of node pruning while optimizing for weight pruning. We then demonstrate the effectiveness of the proposed stochastic framework, used together with a weight decay regularizer, on popular classification models: removing 50% of the nodes in an MLP for MNIST classification and 60% of the filters in VGG-16 for CIFAR10 classification; and on medical image models: removing 60% of the channels in a U-Net for instance segmentation and 50% of the channels in a CNN model for COVID-19 detection. For these node-pruned networks, we also present competitive weight pruning results that are only slightly less accurate than the original, dense networks.
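The core idea of sampling a pruning mask from a probability function that favors zeroing smaller weights can be sketched as follows. This is a minimal illustration, not the paper's implementation: the softmax-over-negative-magnitude form, the `temperature` parameter, and the fixed pruning fraction are all assumptions chosen for clarity.

```python
import numpy as np

def stochastic_prune(weights, prune_frac=0.5, temperature=1.0, rng=None):
    """Zero a fraction of weights, sampled so that smaller-magnitude
    weights are more likely to be pruned.

    The probability function used here (softmax over negative |w|) is an
    illustrative choice; the paper's exact function may differ.
    """
    rng = np.random.default_rng(rng)
    flat = weights.ravel()
    # Higher score for smaller |w| -> higher chance of being zeroed.
    scores = np.exp(-np.abs(flat) / temperature)
    probs = scores / scores.sum()
    n_prune = int(prune_frac * flat.size)
    # Sample distinct indices to zero, biased toward small weights.
    idx = rng.choice(flat.size, size=n_prune, replace=False, p=probs)
    mask = np.ones_like(flat)
    mask[idx] = 0.0
    return (flat * mask).reshape(weights.shape)
```

A deterministic magnitude threshold would always keep the same weights; sampling instead lets occasionally-large weights be pruned and small ones survive, which is what allows the regularizer to reshape the weight distribution during training before the mask is finalized.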
