Paper Title

Dual Cross-Attention Learning for Fine-Grained Visual Categorization and Object Re-Identification

Authors

Haowei Zhu, Wenjing Ke, Dong Li, Ji Liu, Lu Tian, Yi Shan

Abstract
Recently, self-attention mechanisms have shown impressive performance in various NLP and CV tasks, which can help capture sequential characteristics and derive global information. In this work, we explore how to extend self-attention modules to better learn subtle feature embeddings for recognizing fine-grained objects, e.g., different bird species or person identities. To this end, we propose a dual cross-attention learning (DCAL) algorithm to coordinate with self-attention learning. First, we propose global-local cross-attention (GLCA) to enhance the interactions between global images and local high-response regions, which can help reinforce the spatial-wise discriminative clues for recognition. Second, we propose pair-wise cross-attention (PWCA) to establish the interactions between image pairs. PWCA can regularize the attention learning of an image by treating another image as a distractor and will be removed during inference. We observe that DCAL can reduce misleading attention and diffuse the attention response to discover more complementary parts for recognition. We conduct extensive evaluations on fine-grained visual categorization and object re-identification. Experiments demonstrate that DCAL performs on par with state-of-the-art methods and consistently improves multiple self-attention baselines, e.g., surpassing DeiT-Tiny and ViT-Base by 2.8% and 2.4% mAP on MSMT17, respectively.
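To make the GLCA idea in the abstract concrete, the following is a minimal numpy sketch (not the authors' implementation): queries are taken from the top-R local tokens with the highest response scores (e.g., accumulated attention to the class token, as is common in ViT analysis), while keys and values come from the full global token sequence. Single-head attention is assumed, and all names (`global_local_cross_attention`, `top_r`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def global_local_cross_attention(tokens, response, top_r=0.25):
    """Sketch of global-local cross-attention (GLCA).

    tokens:   (n, d) array of patch embeddings (the global sequence).
    response: (n,) per-token response scores used to pick local
              high-response regions (assumed: attention to CLS token).
    top_r:    fraction of tokens kept as local queries.
    Returns the (k, d) cross-attended local features.
    """
    n, d = tokens.shape
    k = max(1, int(n * top_r))
    idx = np.argsort(response)[-k:]        # indices of high-response tokens
    q = tokens[idx]                        # local queries    (k, d)
    scores = q @ tokens.T / np.sqrt(d)     # local-to-global attention scores
    return softmax(scores) @ tokens        # attended output  (k, d)
```

Because each softmax row sums to one, every output row is a convex combination of the global tokens, i.e., local queries aggregate information from the whole image, which matches the abstract's description of reinforcing spatially discriminative clues.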
