Paper Title
Towards Practical Control of Singular Values of Convolutional Layers
Paper Authors
Paper Abstract
In general, convolutional neural networks (CNNs) are easy to train, but their essential properties, such as generalization error and adversarial robustness, are hard to control. Recent research demonstrated that singular values of convolutional layers significantly affect such elusive properties and offered several methods for controlling them. Nevertheless, these methods present an intractable computational challenge or resort to coarse approximations. In this paper, we offer a principled approach to alleviating constraints of the prior art at the expense of an insignificant reduction in layer expressivity. Our method is based on the tensor-train decomposition; it retains control over the actual singular values of convolutional mappings while providing a structurally sparse and hardware-friendly representation. We demonstrate the improved properties of modern CNNs with our method and analyze its impact on model performance, calibration, and adversarial robustness. The source code is available at: https://github.com/WhiteTeaDragon/practical_svd_conv
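For readers unfamiliar with the decomposition the abstract refers to, the standard TT-SVD algorithm factors a multi-way tensor into a chain of small "cores" via sequential truncated SVDs. The sketch below is a minimal NumPy illustration of TT-SVD applied to a convolutional kernel; the kernel layout `(h, w, c_in, c_out)`, the rank cap, and the function names are illustrative assumptions, not the paper's exact construction.

```python
# Minimal sketch of TT-SVD: decompose a d-way tensor into tensor-train
# cores via sequential truncated SVDs. Illustrative only -- the rank cap
# and kernel layout are assumptions, not the paper's method.
import numpy as np

def tt_svd(tensor, max_rank):
    """Return a list of TT cores, each of shape (r_prev, n_k, r_next)."""
    shape = tensor.shape
    d = len(shape)
    cores = []
    rank = 1
    mat = tensor.reshape(rank * shape[0], -1)
    for k in range(d - 1):
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        r_next = min(max_rank, len(s))
        # Left factor becomes the k-th core; the rest is carried forward.
        cores.append(u[:, :r_next].reshape(rank, shape[k], r_next))
        mat = (np.diag(s[:r_next]) @ vt[:r_next]).reshape(r_next * shape[k + 1], -1)
        rank = r_next
    cores.append(mat.reshape(rank, shape[-1], 1))
    return cores

# Toy 3x3 conv kernel with 8 input and 16 output channels.
kernel = np.random.randn(3, 3, 8, 16)
cores = tt_svd(kernel, max_rank=4)
print([c.shape for c in cores])  # small 3-way cores replacing one 4-way kernel
```

With the rank cap removed (set above the tensor's true TT ranks), contracting the cores back together reproduces the original kernel exactly; truncating the ranks trades reconstruction error for the structurally sparse representation the abstract describes.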