Paper title
Exploring Categorical Regularization for Domain Adaptive Object Detection
Paper authors
Paper abstract
In this paper, we tackle the domain adaptive object detection problem, where the main challenge lies in the significant domain gap between source and target domains. Previous work seeks to plainly align image-level and instance-level shifts to eventually minimize the domain discrepancy. However, these methods still fail to match crucial image regions and important instances across domains, which strongly affects domain shift mitigation. In this work, we propose a simple but effective categorical regularization framework to alleviate this issue. It can be applied as a plug-and-play component to a series of Domain Adaptive Faster R-CNN methods, which are prominent in domain adaptive detection. Specifically, by integrating an image-level multi-label classifier on top of the detection backbone, we can obtain the sparse but crucial image regions corresponding to categorical information, thanks to the weakly supervised localization ability of classification. Meanwhile, at the instance level, we leverage the categorical consistency between image-level predictions (by the classifier) and instance-level predictions (by the detection head) as a regularization factor to automatically hunt for the hard-to-align instances of target domains. Extensive experiments on various domain shift scenarios show that our method obtains a significant performance gain over the original Domain Adaptive Faster R-CNN detectors. Furthermore, qualitative visualization and analyses demonstrate the ability of our method to attend to the key regions and instances for domain adaptation. Our code is open-source and available at https://github.com/Megvii-Nanjing/CR-DA-DET.
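The instance-level idea in the abstract can be illustrated with a small sketch. The abstract gives no formula, so everything here is an assumption made for illustration: the function names `consistency_weight` and `weighted_domain_loss` are hypothetical, the weight is taken to be the absolute gap between the image-level and instance-level scores for the instance's predicted class, and the alignment loss is taken to be a binary cross-entropy from an instance-level domain classifier. The released code may do this differently.

```python
import math


def consistency_weight(image_probs, instance_probs):
    """Hypothetical regularization weight for one instance.

    image_probs:    per-class scores from the image-level multi-label classifier.
    instance_probs: per-class scores from the detection head for this instance.
    Assumption: the weight is the disagreement between the two scores
    on the class the detection head predicts for the instance.
    """
    # Class predicted by the detection head for this instance.
    cls = max(range(len(instance_probs)), key=lambda c: instance_probs[c])
    return abs(image_probs[cls] - instance_probs[cls])


def weighted_domain_loss(domain_prob, is_target, weight):
    """Binary cross-entropy of an (assumed) instance-level domain classifier,
    scaled by the consistency weight so inconsistent instances count more."""
    eps = 1e-12  # avoids log(0)
    label = 1.0 if is_target else 0.0
    bce = -(label * math.log(domain_prob + eps)
            + (1.0 - label) * math.log(1.0 - domain_prob + eps))
    return weight * bce


# The image-level classifier says class 0 is present and class 1 is not.
image_probs = [0.9, 0.1, 0.2]
consistent = consistency_weight(image_probs, [0.85, 0.10, 0.05])    # head agrees
inconsistent = consistency_weight(image_probs, [0.20, 0.70, 0.10])  # head disagrees
print(consistent < inconsistent)
```

Under these assumptions, an instance whose detection-head prediction disagrees with the image-level prediction receives a larger weight, so it contributes more to the alignment loss. This is one way to read "hunting for hard-to-align instances".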