Shift Equivariance in Object Detection

Paper title

Shift Equivariance in Object Detection

Authors

Marco Manfredi, Yu Wang

Abstract

Robustness to small image translations is a highly desirable property for object detectors. However, recent works have shown that CNN-based classifiers are not shift invariant. It is unclear to what extent this could impact object detection, mainly because of the architectural differences between the two and the dimensionality of the prediction space of modern detectors. To assess shift equivariance of object detection models end-to-end, in this paper we propose an evaluation metric, built upon a greedy search of the lower and upper bounds of the mean average precision on a shifted image set. Our new metric shows that modern object detection architectures, no matter if one-stage or two-stage, anchor-based or anchor-free, are sensitive to even one pixel shift to the input images. Furthermore, we investigate several possible solutions to this problem, both taken from the literature and newly proposed, quantifying the effectiveness of each one with the suggested metric. Our results indicate that none of these methods can provide full shift equivariance. Measuring and analyzing the extent of shift variance of different models and the contributions of possible factors, is a first step towards being able to devise methods that mitigate or even leverage such variabilities.
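
The abstract describes the metric only as a greedy search for lower and upper bounds of the mean average precision over a set of shifted images. The sketch below illustrates one way such a search could look, and it is not the authors' implementation. It assumes a coordinate-wise greedy strategy (for each image in turn, keep the shift that most improves or most degrades a dataset-level score) and uses pooled precision as a simple stand-in for mAP. The names `greedy_bound` and `dataset_precision` and the `TP`/`FP` counts are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

# Toy, made-up counts (illustration only): TP[i][s] and FP[i][s] are the true and
# false positives a hypothetical detector yields on image i under shift s.
# Shift index 0 stands for the unshifted image.
TP = [[3, 2, 3], [1, 1, 0]]
FP = [[1, 0, 2], [0, 1, 1]]


def dataset_precision(choice: Sequence[int]) -> float:
    """Stand-in for mAP: precision pooled over all images, one shift per image."""
    tp = sum(TP[i][s] for i, s in enumerate(choice))
    fp = sum(FP[i][s] for i, s in enumerate(choice))
    return tp / (tp + fp) if tp + fp else 0.0


def greedy_bound(
    num_images: int,
    num_shifts: int,
    metric: Callable[[Sequence[int]], float],
    maximize: bool,
) -> Tuple[float, List[int]]:
    """Greedily pick one shift per image to push the dataset metric up or down.

    Assumption: images are visited once, in order, and a shift is kept only if
    it strictly improves the current bound. The result is a greedy estimate,
    not a guaranteed optimum over all shift combinations.
    """
    choice = [0] * num_images          # start from the unshifted image set
    best = metric(choice)
    for i in range(num_images):
        for s in range(num_shifts):
            trial = choice.copy()
            trial[i] = s               # change only image i's shift
            score = metric(trial)
            better = score > best if maximize else score < best
            if better:
                best, choice = score, trial
    return best, choice


upper, upper_choice = greedy_bound(2, 3, dataset_precision, maximize=True)
lower, lower_choice = greedy_bound(2, 3, dataset_precision, maximize=False)
print(f"lower={lower:.3f} baseline={dataset_precision([0, 0]):.3f} upper={upper:.3f}")
```

Under these assumptions, the gap between the two bounds indicates how much the dataset-level score can move when each image is shifted independently; a fully shift-equivariant detector would show no gap.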
