Paper Title
Detection of Adversarial Supports in Few-shot Classifiers Using Self-Similarity and Filtering
Paper Authors
Paper Abstract
Few-shot classifiers excel under limited training samples, making them useful in applications with sparse user-provided labels. Their unique relative prediction setup offers opportunities for novel attacks, such as targeting the support sets required to categorise unseen test samples, which are not available in other machine learning setups. In this work, we propose a detection strategy to identify adversarial support sets, which aim to destroy a few-shot classifier's understanding of a certain class. We achieve this by introducing the concept of self-similarity of a support set and by employing filtering of supports. Our method is attack-agnostic, and, to the best of our knowledge, we are the first to explore adversarial detection for the support sets of few-shot classifiers. Despite its conceptual simplicity, our evaluation on the miniImagenet (MI) and CUB datasets exhibits good attack detection performance, with high AUROC scores. We show that self-similarity and filtering for adversarial detection can be paired with other filtering functions, constituting a generalisable concept.
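The abstract does not spell out how self-similarity of a support set is computed; one plausible reading, sketched below under the assumption that supports are compared in an embedding space, is the mean pairwise cosine similarity of the support embeddings for a class. A low score would suggest the supports are mutually inconsistent, which may indicate an adversarial support set. The function name and metric here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def self_similarity(support_embeddings):
    """Mean pairwise cosine similarity of one class's support embeddings.

    Hypothetical sketch: a low value suggests the supports disagree with
    each other, which may flag an adversarial (poisoned) support set.
    """
    X = np.asarray(support_embeddings, dtype=float)
    X = X / np.linalg.norm(X, axis=1, keepdims=True)  # L2-normalise each embedding
    sims = X @ X.T                                    # cosine similarity matrix
    n = len(X)
    off_diag = sims[~np.eye(n, dtype=bool)]           # drop trivial self-pairs
    return off_diag.mean()

# A coherent support set scores higher than one containing an outlier:
clean = [[1.0, 0.0], [0.9, 0.1], [1.0, 0.05]]
mixed = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(self_similarity(clean) > self_similarity(mixed))  # True
```

Detection would then threshold this score (or feed it to a filtering step, as the abstract suggests) to decide whether a support set looks adversarial.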