Paper Title
Dataset Bias in Few-shot Image Recognition
Paper Authors
Paper Abstract
The goal of few-shot image recognition (FSIR) is to identify novel categories from a small number of annotated samples by exploiting transferable knowledge learned from training data (base categories). Most current studies assume that this transferable knowledge can be readily applied to novel categories. However, such transferability may be affected by dataset bias, a problem that has rarely been investigated. Moreover, most few-shot learning methods are biased toward particular datasets, which is also an important issue that deserves in-depth study. In this paper, we first investigate the impact of transferable capabilities learned from base categories. Specifically, we use relevance to measure the relationship between base categories and novel categories, and we describe the distribution of base categories via instance density and category diversity. The FSIR model learns better transferable knowledge from relevant training data, and within relevant data, dense instances or diverse categories can further enrich the learned knowledge. Experimental results on different sub-datasets of ImageNet demonstrate that category relevance, instance density, and category diversity can characterize the transferable bias from base categories. Second, we investigate performance differences across datasets from the perspectives of dataset structure and few-shot learning methods. Specifically, we introduce image complexity, intra-concept visual consistency, and inter-concept visual similarity to quantify the characteristics of dataset structures. We use these quantitative characteristics and four few-shot learning methods to analyze performance differences on five different datasets. Based on the experimental analysis, we obtain several insightful observations from the perspectives of both dataset structure and few-shot learning methods. We hope these observations will help guide future FSIR research.
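The abstract names intra-concept visual consistency and inter-concept visual similarity but does not define them. As a rough illustration only, the sketch below assumes each image is already represented by a feature vector and uses cosine similarity: consistency is taken as the mean pairwise similarity among images of the same class, and inter-concept similarity as the mean pairwise similarity between class mean vectors. These definitions, the function names, and the toy data are all assumptions for illustration and may differ from the measures actually used in the paper.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def class_mean(vectors):
    """Element-wise mean of a list of feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def intra_concept_consistency(features):
    """Hypothetical measure: mean pairwise cosine similarity within each
    class, averaged over classes that have at least two images."""
    per_class = []
    for vectors in features.values():
        pairs = list(combinations(vectors, 2))
        if pairs:
            per_class.append(sum(cosine(u, v) for u, v in pairs) / len(pairs))
    return sum(per_class) / len(per_class)


def inter_concept_similarity(features):
    """Hypothetical measure: mean pairwise cosine similarity between the
    mean feature vectors of different classes (needs two or more classes)."""
    means = [class_mean(vectors) for vectors in features.values()]
    pairs = list(combinations(means, 2))
    return sum(cosine(u, v) for u, v in pairs) / len(pairs)


# Toy 2-D features (made up for illustration, not from any real dataset).
toy_features = {
    "cat": [[1.0, 0.0], [0.9, 0.1]],
    "dog": [[0.0, 1.0], [0.1, 0.9]],
}
consistency = intra_concept_consistency(toy_features)
similarity = inter_concept_similarity(toy_features)
print(f"intra-concept consistency: {consistency:.3f}")
print(f"inter-concept similarity:  {similarity:.3f}")
```

In this toy setup the classes are tight and well separated, so consistency comes out much higher than inter-concept similarity. Under these assumed definitions, a dataset whose two values are close together would be one in which classes are visually harder to tell apart.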