Linking average- and worst-case perturbation robustness via class selectivity and dimensionality

Paper title

Linking average- and worst-case perturbation robustness via class selectivity and dimensionality

Authors

Matthew L. Leavitt, Ari Morcos

Abstract

Representational sparsity is known to affect robustness to input perturbations in deep neural networks (DNNs), but less is known about how the semantic content of representations affects robustness. Class selectivity, the variability of a unit's responses across data classes or dimensions, is one way of quantifying the sparsity of semantic representations. Given recent evidence that class selectivity may not be necessary for, and in some cases can impair, generalization, we investigate whether it also confers robustness (or vulnerability) to perturbations of input data. We found that networks regularized to have lower levels of class selectivity were more robust to average-case (naturalistic) perturbations, while networks with higher class selectivity were more vulnerable. In contrast, class selectivity increased robustness to multiple types of worst-case (i.e., white-box adversarial) perturbations, suggesting that while decreasing class selectivity is helpful for average-case perturbations, it is harmful for worst-case perturbations. To explain this difference, we studied the dimensionality of the networks' representations: we found that the dimensionality of early-layer representations is inversely proportional to a network's class selectivity, and that adversarial samples cause a larger increase in early-layer dimensionality than corrupted samples. Furthermore, the input-unit gradient is more variable across samples and units in high-selectivity networks than in low-selectivity networks. These results lead to the conclusion that units participate more consistently in low-selectivity regimes than in high-selectivity regimes, effectively creating a larger attack surface and hence greater vulnerability to worst-case perturbations.
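
The abstract describes class selectivity only in words, as the variability of a unit's responses across classes. The sketch below illustrates one commonly used way to turn that idea into a number: compare a unit's largest class-conditional mean activation with the mean of its activations for the remaining classes. The exact formula, the function name class_selectivity_index, the eps term, and the assumption of non-negative (e.g., post-ReLU) activations are illustrative choices that are not stated in this abstract, so treat this as a rough sketch rather than the authors' implementation.

```python
import numpy as np


def class_selectivity_index(activations, labels, eps=1e-7):
    """Illustrative per-unit selectivity index (an assumed formulation).

    activations: array of shape (n_samples, n_units), assumed non-negative.
    labels:      array of shape (n_samples,) with at least two distinct classes.
    Returns an array of shape (n_units,); values near 0 mean the unit responds
    similarly to all classes, values near 1 mean it responds to one class only.
    """
    activations = np.asarray(activations, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    if len(classes) < 2:
        raise ValueError("need at least two classes")

    # Mean activation of every unit for each class: (n_classes, n_units).
    class_means = np.stack(
        [activations[labels == c].mean(axis=0) for c in classes]
    )

    # Largest class-conditional mean, and the mean over all other classes.
    mu_max = class_means.max(axis=0)
    mu_rest = (class_means.sum(axis=0) - mu_max) / (len(classes) - 1)

    # eps (hypothetical default) avoids division by zero for silent units.
    return (mu_max - mu_rest) / (mu_max + mu_rest + eps)
```

Under this formulation, a unit that fires for a single class scores close to 1, and a unit that fires equally for every class scores 0. A regularizer that lowers or raises class selectivity, as mentioned in the abstract, could in principle add a term based on such an index to the training loss, though the abstract does not specify how the regularization was done.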
