Paper Title

DWM: A Decomposable Winograd Method for Convolution Acceleration

Authors

Di Huang, Xishan Zhang, Rui Zhang, Tian Zhi, Deyuan He, Jiaming Guo, Chang Liu, Qi Guo, Zidong Du, Shaoli Liu, Tianshi Chen, Yunji Chen

Abstract

Winograd's minimal filtering algorithm has been widely used in Convolutional Neural Networks (CNNs) to reduce the number of multiplications for faster processing. However, it is only effective on convolutions with a kernel size of 3x3 and a stride of 1, because it suffers from significantly increased FLOPs and numerical accuracy problems for kernel sizes larger than 3x3, and it fails on convolutions with a stride larger than 1. In this paper, we propose a novel Decomposable Winograd Method (DWM), which breaks through the limitations of the original Winograd's minimal filtering algorithm and extends it to a wide range of general convolutions. DWM decomposes kernels with a large size or a large stride into several small kernels with a stride of 1, to which the Winograd method can then be applied, so that DWM reduces the number of multiplications while preserving numerical accuracy. It enables fast exploration of larger kernel sizes and larger stride values in CNNs for high performance and accuracy, and even opens up the potential for new CNNs. Compared with the original Winograd algorithm, the proposed DWM is able to support all kinds of convolutions with a speedup of ~2x, without affecting the numerical accuracy.
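The abstract describes two decomposition steps: splitting a strided convolution into stride-1 phases, and tiling an oversized kernel into pieces no larger than 3x3 so that Winograd F(2x2, 3x3) applies to each piece. The sketch below illustrates the decomposition algebra in PyTorch; it is a minimal illustration, not the paper's implementation, and the function names `dwm_stride2_conv2d` and `dwm_tiled_conv2d` are hypothetical. A production version would replace each inner `F.conv2d` call with a Winograd kernel.

```python
import torch
import torch.nn.functional as F

def dwm_stride2_conv2d(x, w):
    """Stride decomposition (sketch): a stride-2 convolution equals the sum
    of four stride-1 convolutions over the 2x2 interleaved phases of the
    input and kernel (assumes kernel size >= 2, so no phase is empty)."""
    out_h = (x.shape[2] - w.shape[2]) // 2 + 1
    out_w = (x.shape[3] - w.shape[3]) // 2 + 1
    y = 0
    for i in range(2):
        for j in range(2):
            xs = x[:, :, i::2, j::2]  # input phase (i, j)
            ws = w[:, :, i::2, j::2]  # matching kernel phase
            # Each phase is a plain stride-1 convolution, so Winograd applies;
            # phases can produce extra rows/cols, so crop to the true output.
            y = y + F.conv2d(xs, ws)[:, :, :out_h, :out_w]
    return y

def dwm_tiled_conv2d(x, w, tile=3):
    """Kernel-size decomposition (sketch): an oversized stride-1 kernel is
    cut into tiles of at most `tile` x `tile`; each tile convolves a shifted
    view of the input, and the partial outputs are summed."""
    kh, kw = w.shape[2], w.shape[3]
    out_h = x.shape[2] - kh + 1
    out_w = x.shape[3] - kw + 1
    y = 0
    for i in range(0, kh, tile):
        for j in range(0, kw, tile):
            ws = w[:, :, i:i + tile, j:j + tile]  # <= 3x3 sub-kernel
            xs = x[:, :, i:, j:]                  # input shifted by tile offset
            y = y + F.conv2d(xs, ws)[:, :, :out_h, :out_w]
    return y

# Both decompositions reproduce the direct convolution up to float rounding.
x = torch.randn(1, 4, 16, 16)
w = torch.randn(8, 4, 5, 5)
assert torch.allclose(dwm_stride2_conv2d(x, w), F.conv2d(x, w, stride=2), atol=1e-4)
assert torch.allclose(dwm_tiled_conv2d(x, w), F.conv2d(x, w), atol=1e-4)
```

Because every piece ends up as a small stride-1 convolution, the multiplication savings of Winograd carry over while the exact arithmetic of the original convolution is preserved, which is the source of the paper's claim that numerical accuracy is unaffected.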
