Paper Title
OIMNet++: Prototypical Normalization and Localization-aware Learning for Person Search
Paper Authors
Paper Abstract
We address the task of person search, that is, localizing and re-identifying query persons from a set of raw scene images. Recent approaches are typically built upon OIMNet, a pioneering work on person search that learns joint person representations for performing both detection and person re-identification (reID) tasks. To obtain the representations, they extract features from pedestrian proposals and then project them onto a unit hypersphere with L2 normalization. These methods also incorporate all positive proposals, i.e., those that sufficiently overlap with the ground truth, equally when learning person representations for reID. We have found that 1) L2 normalization without considering feature distributions degrades the discriminative power of person representations, and 2) positive proposals often also depict background clutter and person overlaps, which could encode noisy features into person representations. In this paper, we introduce OIMNet++, which addresses the aforementioned limitations. To this end, we introduce a novel normalization layer, dubbed ProtoNorm, that calibrates features from pedestrian proposals while considering the long-tail distribution of person IDs, enabling L2-normalized person representations to be discriminative. We also propose a localization-aware feature learning scheme that encourages better-aligned proposals to contribute more to learning discriminative representations. Experimental results and analysis on standard person search benchmarks demonstrate the effectiveness of OIMNet++.
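The abstract contrasts plain L2 normalization with a calibration that weights each identity equally before projecting onto the unit hypersphere. The following NumPy sketch illustrates that idea under stated assumptions: the function name `proto_norm`, the choice of per-ID mean features as prototypes, and the standardize-then-L2-normalize ordering are illustrative guesses, not the authors' exact implementation.

```python
import numpy as np

def proto_norm(features, labels, eps=1e-5):
    """ProtoNorm-style calibration (illustrative sketch, not the paper's code).

    Statistics computed directly over proposal features would be biased
    toward frequent IDs under a long-tail distribution. Instead, compute
    channel-wise statistics over per-ID prototypes, so every identity
    contributes equally, then L2-normalize onto the unit hypersphere.
    """
    # Per-ID prototypes: mean feature vector of each identity.
    ids = np.unique(labels)
    prototypes = np.stack([features[labels == i].mean(axis=0) for i in ids])
    # Channel-wise statistics over prototypes (one vote per identity).
    mu = prototypes.mean(axis=0)
    sigma = prototypes.std(axis=0)
    # Standardize proposal features, then project onto the unit hypersphere.
    calibrated = (features - mu) / (sigma + eps)
    return calibrated / np.linalg.norm(calibrated, axis=1, keepdims=True)

# Toy long-tail batch: ID 0 has 6 proposals, ID 1 has only 1.
rng = np.random.default_rng(0)
feats = np.concatenate([rng.normal(2.0, 1.0, (6, 4)),
                        rng.normal(-2.0, 1.0, (1, 4))])
labels = np.array([0] * 6 + [1])
out = proto_norm(feats, labels)
print(np.linalg.norm(out, axis=1))  # every row has unit norm
```

Note the design point this makes concrete: with sample-level statistics, the six proposals of ID 0 would dominate `mu` and `sigma`; with prototype-level statistics, both identities contribute equally regardless of how many proposals they have.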