Open Long-Tailed Recognition in a Dynamic World

Paper Title

Open Long-Tailed Recognition in a Dynamic World

Authors

Ziwei Liu, Zhongqi Miao, Xiaohang Zhan, Jiayun Wang, Boqing Gong, Stella X. Yu

Abstract

Real-world data often exhibits a long-tailed and open-ended (with unseen classes) distribution. A practical recognition system must balance between majority (head) and minority (tail) classes, generalize across the distribution, and acknowledge novelty upon the instances of unseen classes (open classes). We define Open Long-Tailed Recognition++ (OLTR++) as learning from such naturally distributed data and optimizing for the classification accuracy over a balanced test set that includes both known and open classes. OLTR++ handles imbalanced classification, few-shot learning, open-set recognition, and active learning in one integrated algorithm, whereas existing classification approaches often focus on only one or two of these aspects and perform poorly across the entire spectrum. The key challenges are: 1) how to share visual knowledge between head and tail classes, 2) how to reduce confusion between tail and open classes, and 3) how to actively explore open classes with learned knowledge. Our algorithm, OLTR++, maps images to a feature space such that visual concepts can relate to each other through a memory association mechanism and a learned metric (dynamic meta-embedding) that both respects the closed-world classification of seen classes and acknowledges the novelty of open classes. Additionally, we propose an active learning scheme based on visual memory, which learns to recognize open classes in a data-efficient manner for future expansions. On three large-scale open long-tailed datasets we curated from ImageNet (object-centric), Places (scene-centric), and MS1M (face-centric) data, as well as three standard benchmarks (CIFAR-10-LT, CIFAR-100-LT, and iNaturalist-18), our approach, as a unified framework, consistently demonstrates competitive performance. Notably, our approach also shows strong potential for the active exploration of open classes and the fairness analysis of minority groups.
