Paper Title

Winning the Lottery Ahead of Time: Efficient Early Network Pruning

Authors

John Rachwan, Daniel Zügner, Bertrand Charpentier, Simon Geisler, Morgane Ayle, Stephan Günnemann

Abstract

Pruning, the task of sparsifying deep neural networks, has recently received increasing attention. Although state-of-the-art pruning methods extract highly sparse models, they neglect two main challenges: (1) the process of finding these sparse models is often very expensive; (2) unstructured pruning does not provide benefits in terms of GPU memory, training time, or carbon emissions. We propose Early Compression via Gradient Flow Preservation (EarlyCroP), which efficiently extracts state-of-the-art sparse models before or early in training, addressing challenge (1), and can be applied in a structured manner, addressing challenge (2). This enables us to train sparse networks on commodity GPUs whose dense versions would be too large, thereby saving costs and reducing hardware requirements. We empirically show that EarlyCroP outperforms a rich set of baselines for many tasks (incl. classification, regression) and domains (incl. computer vision, natural language processing, and reinforcement learning). EarlyCroP leads to accuracy comparable to dense training while outperforming pruning baselines.
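The abstract describes pruning a network before or early in training, in a structured way, while preserving gradient flow. As a rough illustration of that idea (not the paper's actual EarlyCroP algorithm), the following minimal PyTorch sketch scores parameters with a GraSP-style saliency, theta * (H g), and zeroes out the least important convolution filters. The function names, the per-filter aggregation, and the toy model are assumptions made for this illustration only.

import torch
import torch.nn as nn
import torch.nn.functional as F

def gradient_flow_saliency(model, inputs, targets):
    # Per-parameter saliency theta * (H g), where H is the loss Hessian and g the
    # gradient (GraSP-style; Wang et al., 2020). Higher means more important for
    # preserving gradient flow. Illustrative only, not the paper's EarlyCroP criterion.
    named = [(n, p) for n, p in model.named_parameters() if p.requires_grad]
    params = [p for _, p in named]
    loss = F.cross_entropy(model(inputs), targets)
    grads = torch.autograd.grad(loss, params, create_graph=True)
    # Hessian-vector product H g via a second differentiation of g^T g_detached.
    gnorm = sum((g * g.detach()).sum() for g in grads)
    hvp = torch.autograd.grad(gnorm, params)
    return {n: p.detach() * h for (n, p), h in zip(named, hvp)}

def prune_conv_filters(conv, saliency, sparsity=0.5):
    # Structured pruning: zero the output filters of `conv` with the lowest
    # aggregated saliency.
    per_filter = saliency.sum(dim=(1, 2, 3))            # one score per output filter
    k = int(sparsity * per_filter.numel())
    drop = per_filter.topk(k, largest=False).indices    # least important filters
    with torch.no_grad():
        conv.weight[drop] = 0.0
        if conv.bias is not None:
            conv.bias[drop] = 0.0

# Toy usage on random data (shapes and model are assumptions).
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                      nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(16, 10))
x, y = torch.randn(8, 3, 32, 32), torch.randint(0, 10, (8,))
scores = gradient_flow_saliency(model, x, y)
prune_conv_filters(model[0], scores["0.weight"], sparsity=0.5)

Zeroing filters only emulates structured sparsity; realising the GPU memory and training-time savings the abstract emphasizes requires physically removing the pruned filters (and the matching input channels of the subsequent layer).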
