Paper Title
One Weight Bitwidth to Rule Them All
Paper Authors
Paper Abstract
Weight quantization for deep ConvNets has shown promising results for applications such as image classification and semantic segmentation, and it is especially important for applications where memory storage is limited. However, when aiming for quantization without accuracy degradation, different tasks may end up with different bitwidths. This creates complexity for software and hardware support, and the complexity accumulates when one considers mixed-precision quantization, in which each layer's weights use a different bitwidth. Our key insight is that optimizing for the smallest bitwidth subject to no accuracy degradation is not necessarily an optimal strategy: one cannot decide optimality between two bitwidths if one yields a smaller model while the other yields better accuracy. In this work, we take a first step toward understanding whether some weight bitwidths are better than others by aligning all candidates to the same model size using a width-multiplier. Under this setting, somewhat surprisingly, we show that using a single bitwidth for the whole network can achieve better accuracy than mixed-precision quantization targeting zero accuracy degradation when both have the same model size. In particular, our results suggest that when the number of channels becomes a hyperparameter to tune, a single weight bitwidth throughout the network shows superior results for model compression.
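As a rough illustration of the alignment idea in the abstract (this is a sketch, not code from the paper), the snippet below shows how a width-multiplier can hold total weight storage roughly constant while the weight bitwidth varies: since a convolution's weight count scales quadratically in the width-multiplier (both input and output channels grow by it), a b-bit model can match an 8-bit baseline's size with a multiplier of sqrt(8 / b). All layer shapes and helper names here are hypothetical.

```python
import math

def conv_weight_count(in_ch: int, out_ch: int, k: int = 3) -> int:
    """Number of weights in a k x k convolution layer."""
    return in_ch * out_ch * k * k

def model_size_bits(layers, bitwidth: int) -> int:
    """Total weight storage in bits for a list of (in_ch, out_ch) layers."""
    return sum(conv_weight_count(i, o) for i, o in layers) * bitwidth

# Hypothetical 3-layer ConvNet; channel counts are illustrative only.
base_layers = [(16, 32), (32, 64), (64, 128)]
base_bits = 8
baseline = model_size_bits(base_layers, base_bits)

# Matching the baseline size at bitwidth b needs width-multiplier
# w = sqrt(base_bits / b), because weight counts grow as w**2.
for b in (8, 4, 2, 1):
    w = math.sqrt(base_bits / b)
    scaled = [(round(i * w), round(o * w)) for i, o in base_layers]
    print(f"{b}-bit, width-multiplier {w:.2f}: "
          f"{model_size_bits(scaled, b):,} bits (baseline {baseline:,})")
```

Under this size-matched comparison (up to channel rounding), the question the paper asks becomes well-posed: at equal model size, which bitwidth gives the best accuracy?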