Boosting Deep Open World Recognition by Clustering

Paper title

Boosting Deep Open World Recognition by Clustering

Authors

Dario Fontanel, Fabio Cermelli, Massimiliano Mancini, Samuel Rota Bulò, Elisa Ricci, Barbara Caputo

Abstract

While convolutional neural networks have brought significant advances in robot vision, their ability is often limited to closed world scenarios, where the number of semantic concepts to be recognized is determined by the available training set. Since it is practically impossible to capture all possible semantic concepts present in the real world in a single training set, we need to break the closed world assumption, equipping our robot with the capability to act in an open world. To provide such ability, a robot vision system should be able to (i) identify whether an instance does not belong to the set of known categories (i.e. open set recognition), and (ii) extend its knowledge to learn new classes over time (i.e. incremental learning). In this work, we show how we can boost the performance of deep open world recognition algorithms by means of a new loss formulation enforcing a global to local clustering of class-specific features. In particular, a first loss term, i.e. global clustering, forces the network to map samples closer to the class centroid they belong to while the second one, local clustering, shapes the representation space in such a way that samples of the same class get closer in the representation space while pushing away neighbours belonging to other classes. Moreover, we propose a strategy to learn class-specific rejection thresholds, instead of heuristically estimating a single global threshold, as in previous works. Experiments on RGB-D Object and Core50 datasets show the effectiveness of our approach.
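The two loss terms and the rejection step described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the global term is written as a cross-entropy over negative squared distances to class centroids, the local term as a soft nearest-neighbour style loss, and the function names, the `temperature` parameter, and the `-1` "unknown" label are hypothetical. The class-specific thresholds are simply passed in; how the paper learns them is not reproduced here.

```python
import numpy as np


def class_centroids(features, labels, num_classes):
    """Mean feature vector per class (assumes every class has at least one sample)."""
    return np.stack([features[labels == c].mean(axis=0) for c in range(num_classes)])


def global_clustering_loss(features, labels, centroids):
    """Assumed form: cross-entropy over a softmax of negative squared
    distances, which pulls each sample towards its own class centroid."""
    d2 = ((features[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=-1)  # (N, C)
    logits = -d2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(labels)), labels].mean()


def local_clustering_loss(features, labels, temperature=1.0):
    """Assumed form: soft nearest-neighbour style loss. It is small when the
    close neighbours of each sample share its class, large otherwise."""
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=-1)  # (N, N)
    sim = np.exp(-d2 / temperature)
    np.fill_diagonal(sim, 0.0)  # a sample is not its own neighbour
    same_class = labels[:, None] == labels[None, :]
    pos = (sim * same_class).sum(axis=1)
    total = sim.sum(axis=1)
    eps = 1e-12  # avoids log(0) when a sample has no same-class neighbour
    return -np.log((pos + eps) / (total + eps)).mean()


def predict_or_reject(x, centroids, thresholds):
    """Nearest-centroid prediction with one rejection threshold per class.
    Returns -1 (hypothetical 'unknown' label) when x is too far away."""
    dists = np.linalg.norm(centroids - x, axis=1)
    c = int(dists.argmin())
    return c if dists[c] <= thresholds[c] else -1


if __name__ == "__main__":
    feats = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
    labs = np.array([0, 0, 1, 1])
    cents = class_centroids(feats, labs, num_classes=2)
    print("global loss:", global_clustering_loss(feats, labs, cents))
    print("local loss:", local_clustering_loss(feats, labs))
    print("far sample:", predict_or_reject(np.array([20.0, 20.0]), cents, np.array([1.0, 1.0])))
```

In this sketch, a single global threshold would correspond to passing the same value for every class; per-class values let tight classes reject more aggressively than spread-out ones.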
