Paper Title


Look-ups are not (yet) all you need for deep learning inference

Paper Authors

Calvin McCarter, Nicholas Dronen

Abstract


Fast approximations to matrix multiplication have the potential to dramatically reduce the cost of neural network inference. Recent work on approximate matrix multiplication proposed to replace costly multiplications with table-lookups by fitting a fast hash function from training data. In this work, we propose improvements to this previous work, targeted to the deep learning inference setting, where one has access to both training data and fixed (already learned) model weight matrices. We further propose a fine-tuning procedure for accelerating entire neural networks while minimizing loss in accuracy. Finally, we analyze the proposed method on a simple image classification task. While we show improvements to prior work, overall classification accuracy remains substantially diminished compared to exact matrix multiplication. Our work, despite this negative result, points the way towards future efforts to accelerate inner products with fast nonlinear hashing methods.
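To make the abstract's core idea concrete, below is a minimal illustrative sketch (not the authors' actual method or code) of look-up-based approximate matrix multiplication in the general spirit of product-quantization approaches such as MADDNESS/Bolt: split input vectors into subspaces, map each subvector to a learned prototype via a fast "hash" (here, plain k-means assignment is assumed for simplicity), and replace inner products with sums of precomputed table entries. All function names and parameters are hypothetical.

```python
# Sketch of table-lookup approximate matrix multiplication (illustrative only).
# Assumptions: k-means prototypes stand in for the learned hash function, and
# the weight matrix W is fixed, so per-prototype contributions can be precomputed.
import numpy as np

def fit_tables(X_train, W, n_subspaces=4, n_prototypes=16, n_iters=10, seed=0):
    """Learn per-subspace prototypes from training data and precompute
    lookup tables of each prototype's contribution to the output columns."""
    rng = np.random.default_rng(seed)
    D = X_train.shape[1]
    assert D % n_subspaces == 0
    d = D // n_subspaces
    prototypes, tables = [], []
    for s in range(n_subspaces):
        Xs = X_train[:, s * d:(s + 1) * d]                        # (N, d) subvectors
        C = Xs[rng.choice(len(Xs), n_prototypes, replace=False)]  # init centroids
        for _ in range(n_iters):                                  # plain k-means
            dists = ((Xs[:, None, :] - C[None, :, :]) ** 2).sum(-1)
            assign = dists.argmin(1)
            for k in range(n_prototypes):
                if np.any(assign == k):
                    C[k] = Xs[assign == k].mean(0)
        prototypes.append(C)
        # Table entry [k, j] = contribution of prototype k to output column j.
        tables.append(C @ W[s * d:(s + 1) * d, :])                # (K, M)
    return prototypes, tables

def approx_matmul(X, prototypes, tables):
    """Approximate X @ W: encode each subvector to its nearest prototype,
    then sum the precomputed table rows instead of multiplying."""
    d = prototypes[0].shape[1]
    out = np.zeros((X.shape[0], tables[0].shape[1]))
    for s, (C, T) in enumerate(zip(prototypes, tables)):
        Xs = X[:, s * d:(s + 1) * d]
        codes = ((Xs[:, None, :] - C[None, :, :]) ** 2).sum(-1).argmin(1)
        out += T[codes]                                           # pure look-ups + adds
    return out

# Usage: compare the look-up approximation against exact matrix multiplication.
rng = np.random.default_rng(1)
X_train, W = rng.normal(size=(512, 32)), rng.normal(size=(32, 8))
protos, tabs = fit_tables(X_train, W)
X_test = rng.normal(size=(4, 32))
print(np.abs(approx_matmul(X_test, protos, tabs) - X_test @ W).mean())
```

The gap between the approximation and the exact product in this toy example mirrors the accuracy loss the paper reports: replacing multiplications with look-ups is cheap, but the encoding error compounds across layers, which is why the authors add a fine-tuning procedure and still observe substantially reduced classification accuracy.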
