Paper Title
A Quadruplet Loss for Enforcing Semantically Coherent Embeddings in Multi-output Classification Problems
Paper Authors
Paper Abstract
This paper describes an objective function for learning semantically coherent feature embeddings in multi-output classification problems, i.e., when the response variables have dimension higher than one. In particular, we consider the problems of identity retrieval and soft biometrics labelling in visual surveillance environments, which have been attracting growing interest. Inspired by the triplet loss [34], we propose a generalization that: 1) defines a metric based on the number of agreeing labels between pairs of elements; and 2) disregards the notion of anchor, replacing the constraint d(A1, A2) < d(A1, B) by d(A, B) < d(C, D), for elements A, B, C, D chosen according to the number of agreeing labels between pairs. Like the triplet loss formulation, our proposal privileges small distances between positive pairs, but at the same time explicitly enforces that the distance between other pairs corresponds directly to their similarity in terms of agreeing labels. This yields feature embeddings with a strong correspondence between class centroids and their semantic descriptions, i.e., where elements lie closer to others that share some of their labels than to elements with fully disjoint label memberships. As a practical effect, the proposed loss is particularly suitable for performing joint coarse (soft label) + fine (ID) inference based on simple rules such as k-nearest neighbours, which is a novelty with respect to previous related loss functions. Also, in contrast to its triplet counterpart, the proposed loss is agnostic to any demanding criteria for mining learning instances (such as semi-hard pairs). Our experiments were carried out on five different datasets (BIODI, LFW, IJB-A, Megaface and PETA) and validate our assumptions, showing highly promising results.
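The ordering constraint d(A, B) < d(C, D), ranked by the number of agreeing labels, can be sketched as a hinge loss over all pairs of element pairs. The following minimal NumPy sketch is illustrative only: the function names (`pairwise_agreement`, `quadruplet_loss`), the margin value, and the exhaustive enumeration of quadruplets are assumptions for clarity, not the paper's actual implementation (which would operate on mini-batches of a deep network).

```python
import numpy as np

def pairwise_agreement(labels):
    """Count agreeing label dimensions for every pair of elements.

    labels: (n, k) integer array -- k response variables per element.
    Returns an (n, n) matrix of counts in [0, k].
    """
    return (labels[:, None, :] == labels[None, :, :]).sum(axis=-1)

def quadruplet_loss(emb, labels, margin=0.2):
    """Hinge penalty enforcing d(A, B) + margin < d(C, D) whenever the
    pair (A, B) agrees on strictly more labels than the pair (C, D).

    emb: (n, d) embedding array; labels: (n, k) integer array.
    Exhaustive O(n^4) enumeration -- a didactic sketch, not batch code.
    """
    n = len(emb)
    # Euclidean distance matrix between all embeddings.
    dist = np.linalg.norm(emb[:, None] - emb[None, :], axis=-1)
    agree = pairwise_agreement(labels)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    losses = []
    for a, b in pairs:
        for c, d in pairs:
            if agree[a, b] > agree[c, d]:
                # (a, b) shares more labels, so it must be the closer pair.
                losses.append(max(0.0, dist[a, b] - dist[c, d] + margin))
    return float(np.mean(losses)) if losses else 0.0
```

Note that, unlike the anchored triplet loss, no element plays a distinguished role here: any pair with higher label agreement constrains any pair with lower agreement, which is what removes the need for semi-hard mining.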