Paper Title

Weight Fixing Networks

Paper Authors

Christopher Subia-Waud, Srinandan Dasmahapatra

Paper Abstract

Modern iterations of deep learning models contain millions (billions) of unique parameters, each represented by a b-bit number. Popular attempts at compressing neural networks (such as pruning and quantisation) have shown that many of the parameters are superfluous, which we can remove (pruning) or express with less than b bits (quantisation) without hindering performance. Here we look to go much further in minimising the information content of networks. Rather than a channel or layer-wise encoding, we look to lossless whole-network quantisation to minimise the entropy and number of unique parameters in a network. We propose a new method, which we call Weight Fixing Networks (WFN), designed to realise four model outcome objectives: i) very few unique weights, ii) low-entropy weight encodings, iii) unique weight values which are amenable to energy-saving versions of hardware multiplication, and iv) lossless task-performance. Some of these goals are conflicting. To best balance these conflicts, we combine a few novel (and some well-trodden) tricks: a novel regularisation term (i, ii), a view of clustering cost as relative distance change (i, ii, iv), and a focus on whole-network re-use of weights (i, iii). Our ImageNet experiments demonstrate lossless compression using 56x fewer unique weights and a 1.9x lower weight-space entropy than SOTA quantisation approaches.
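The abstract's headline numbers concern two quantities: the number of unique weight values in the network and the Shannon entropy of the weight-value distribution. It also hints at snapping weights to a shared codebook only when the move is small relative to each weight's magnitude. The sketch below illustrates these ideas in NumPy; it is not the authors' implementation, and the codebook, the threshold `delta`, and all function names are illustrative assumptions.

```python
# Minimal illustrative sketch (not the WFN implementation): unique-weight count,
# weight-space entropy, and a relative-distance clustering rule.
import numpy as np

def unique_weight_count(weights: np.ndarray) -> int:
    """Number of distinct parameter values across the whole (flattened) network."""
    return np.unique(weights).size

def weight_space_entropy(weights: np.ndarray) -> float:
    """Shannon entropy (bits) of the empirical distribution over weight values."""
    _, counts = np.unique(weights, return_counts=True)
    probs = counts / counts.sum()
    return float(-(probs * np.log2(probs)).sum())

def cluster_by_relative_distance(weights: np.ndarray,
                                 codebook: np.ndarray,
                                 delta: float = 0.5) -> np.ndarray:
    """Snap each weight to its nearest codebook value, but only when the move is
    small relative to the weight's magnitude (the 'clustering cost as relative
    distance change' idea). The delta threshold is an arbitrary assumption."""
    nearest = codebook[np.abs(weights[:, None] - codebook[None, :]).argmin(axis=1)]
    rel_change = np.abs(nearest - weights) / (np.abs(weights) + 1e-12)
    return np.where(rel_change <= delta, nearest, weights)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=10_000)  # stand-in for flattened network weights
    # Hypothetical power-of-two codebook: such values allow shift-based multiplication.
    codebook = np.array([-0.25, -0.125, -0.0625, 0.0, 0.0625, 0.125, 0.25])
    w_fixed = cluster_by_relative_distance(w, codebook)
    print("unique weights:", unique_weight_count(w), "->", unique_weight_count(w_fixed))
    print(f"entropy (bits): {weight_space_entropy(w):.2f} -> {weight_space_entropy(w_fixed):.2f}")
```

In this toy example, snapping only the weights whose relative move is small leaves the rest untouched, which is one simple way to picture the trade-off the abstract describes between fewer unique weights (i) and preserving task performance (iv).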
