A Self-Supervised Descriptor for Image Copy Detection

Paper title

A Self-Supervised Descriptor for Image Copy Detection

Authors

Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, Matthijs Douze

Abstract

Image copy detection is an important task for content moderation. We introduce SSCD, a model that builds on a recent self-supervised contrastive training objective. We adapt this method to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and adapting contrastive learning to augmentations that combine images. Our approach relies on an entropy regularization term, promoting consistent separation between descriptor vectors, and we demonstrate that this significantly improves copy detection accuracy. Our method produces a compact descriptor vector, suitable for real-world web-scale applications. Statistical information from a background image distribution can be incorporated into the descriptor. On the recent DISC2021 benchmark, SSCD is shown to outperform both baseline copy detection models and self-supervised architectures designed for image classification by huge margins, in all settings. For example, SSCD outperforms SimCLR descriptors by 48% absolute. Code is available at https://github.com/facebookresearch/sscd-copy-detection.
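
The abstract does not give the form of the entropy regularization term. As a rough illustration only, the sketch below assumes a nearest-neighbor (Kozachenko-Leonenko style) penalty: the negative mean log distance from each L2-normalized descriptor to its closest neighbor in the batch. The function names `l2_normalize` and `entropy_regularizer` are hypothetical and are not taken from the SSCD code, and the sketch ignores any special handling of matching copies within a batch.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row of x to unit Euclidean length."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.maximum(norms, eps)

def entropy_regularizer(descriptors, eps=1e-8):
    """Nearest-neighbor entropy penalty; lower means better-spread vectors.

    Assumption: the penalty is the negative mean log distance from each
    descriptor to its closest other descriptor in the batch. The abstract
    does not specify this form.
    """
    z = l2_normalize(np.asarray(descriptors, dtype=np.float64))
    # For unit vectors, ||a - b||^2 = 2 - 2 * (a . b).
    sq_dist = np.clip(2.0 - 2.0 * (z @ z.T), 0.0, None)
    np.fill_diagonal(sq_dist, np.inf)  # ignore each vector's distance to itself
    nearest = np.sqrt(sq_dist.min(axis=1))
    return float(-np.mean(np.log(nearest + eps)))

# Toy comparison: well-spread descriptors vs. descriptors that nearly coincide.
rng = np.random.default_rng(0)
spread = rng.normal(size=(64, 16))
clumped = 1.0 + 0.01 * rng.normal(size=(64, 16))  # almost identical directions
loss_spread = entropy_regularizer(spread)
loss_clumped = entropy_regularizer(clumped)
```

Under this assumed form, the clumped batch receives the larger penalty, so minimizing the term alongside a contrastive loss would push descriptors apart. That is one way to read "consistent separation between descriptor vectors".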
