Paper Title

Supervised Contrastive Prototype Learning: Augmentation Free Robust Neural Network

Paper Authors

Iordanis Fostiropoulos, Laurent Itti

Paper Abstract

Transformations in the input space of Deep Neural Networks (DNNs) lead to unintended changes in the feature space. Almost perceptually identical inputs, such as adversarial examples, can have significantly distant feature representations. Conversely, Out-of-Distribution (OOD) samples can have highly similar feature representations to training set samples. Our theoretical analysis of DNNs trained with a categorical classification head suggests that the inflexible logit space, restricted by the classification problem size, is one of the root causes of the lack of $\textit{robustness}$. Our second observation is that DNNs over-fit to the training augmentation technique and do not learn $\textit{nuance invariant}$ representations. Inspired by the recent success of prototypical and contrastive learning frameworks for both improving robustness and learning nuance invariant representations, we propose a training framework, $\textbf{Supervised Contrastive Prototype Learning}$ (SCPL). We use an N-pair contrastive loss with prototypes of the same and opposite classes and replace the categorical classification head with a $\textbf{Prototype Classification Head}$ (PCH). Our approach is $\textit{sample efficient}$, does not require $\textit{sample mining}$, can be implemented on any existing DNN without modification to its architecture, and can be combined with other training augmentation techniques. We empirically evaluate the $\textbf{clean}$ robustness of our method on out-of-distribution and adversarial samples. Our framework outperforms other state-of-the-art contrastive and prototype learning approaches in $\textit{robustness}$.
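
To make the abstract's two ingredients concrete, below is a minimal PyTorch sketch of (a) a similarity-based prototype classification head and (b) an N-pair-style contrastive loss computed against class prototypes rather than mined sample pairs. This is an illustration under our own assumptions, not the authors' released code; the names (`PrototypeClassificationHead`, `scpl_loss`, the `temperature` parameter, cosine similarity as the distance measure) are hypothetical choices for exposition.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PrototypeClassificationHead(nn.Module):
    """Replaces categorical logits with learnable class prototypes;
    a sample is scored by its similarity to each prototype.
    Sketch only: the paper's exact head may differ."""
    def __init__(self, embed_dim: int, num_classes: int):
        super().__init__()
        self.prototypes = nn.Parameter(torch.randn(num_classes, embed_dim))

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between embeddings and prototypes acts as logits.
        z = F.normalize(z, dim=-1)
        p = F.normalize(self.prototypes, dim=-1)
        return z @ p.t()  # shape: (batch, num_classes)

def scpl_loss(z: torch.Tensor, labels: torch.Tensor,
              prototypes: torch.Tensor, temperature: float = 0.1) -> torch.Tensor:
    """N-pair-style contrastive loss against prototypes: each embedding is
    pulled toward its own class prototype (positive) and pushed away from
    the prototypes of all other classes (negatives). Using prototypes as
    anchors removes the need for sample mining."""
    z = F.normalize(z, dim=-1)
    p = F.normalize(prototypes, dim=-1)
    sims = (z @ p.t()) / temperature      # (batch, num_classes)
    return F.cross_entropy(sims, labels)  # softmax over all prototypes

# Hypothetical usage with 128-d backbone embeddings and 10 classes.
head = PrototypeClassificationHead(embed_dim=128, num_classes=10)
z = torch.randn(32, 128)                  # stand-in for backbone features
labels = torch.randint(0, 10, (32,))
loss = scpl_loss(z, labels, head.prototypes)
loss.backward()
```

Because the negatives come from the fixed set of class prototypes rather than from other samples in the batch, the loss does not depend on batch composition, which is one plausible reading of the abstract's sample-efficiency and no-mining claims.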
