Paper Title
S2-Net: Self-supervision Guided Feature Representation Learning for Cross-Modality Images
Paper Authors
Paper Abstract
Combining the respective advantages of cross-modality images can compensate for the lack of information in a single modality, which has drawn increasing research attention to multi-modal image matching tasks. Meanwhile, due to the large appearance differences between cross-modality image pairs, existing methods often fail to bring the feature representations of correspondences as close as possible. In this letter, we design a cross-modality feature representation learning network, S2-Net, which is based on the recently successful detect-and-describe pipeline, originally proposed for visible images but adapted to work with cross-modality image pairs. To address the resulting optimization difficulties, we introduce self-supervised learning with a well-designed loss function to guide the training without discarding the original advantages. This novel strategy simulates image pairs in the same modality, which also serves as a useful guide for training on cross-modality images. Notably, it requires no additional data yet significantly improves performance, and it is applicable to any method built on the detect-and-describe pipeline. Extensive experiments are conducted to evaluate the performance of our proposed strategy against both handcrafted and deep learning-based methods. The results show that our elegant formulation, which combines supervised and self-supervised optimization, outperforms the state of the art on the RoadScene and RGB-NIR datasets.
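To make the combined-optimization idea concrete, below is a minimal PyTorch sketch of how a supervised cross-modality descriptor loss might be paired with a self-supervised same-modality term built from an augmented view of one modality. The abstract gives no formulas, so everything here is an assumption: the names descriptor_loss, combined_loss, and alpha, the triplet-margin formulation, and the use of an augmented visible image as the simulated same-modality pair are illustrative choices, not the paper's actual design.

```python
import torch
import torch.nn.functional as F

def descriptor_loss(desc_q, desc_p, margin=1.0):
    # desc_q, desc_p: (N, D) L2-normalized descriptors of N ground-truth
    # correspondences; row i of desc_q matches row i of desc_p.
    pos = (desc_q - desc_p).norm(dim=1)        # distance to the true match
    dist = torch.cdist(desc_q, desc_p)         # (N, N) all pairwise distances
    dist.fill_diagonal_(float("inf"))          # exclude the true match itself
    neg = dist.min(dim=1).values               # hardest in-batch non-match
    return F.relu(margin + pos - neg).mean()   # triplet margin objective

def combined_loss(desc_vis, desc_ir, desc_vis_aug, alpha=0.5):
    # Supervised term on real cross-modality pairs, plus a self-supervised
    # term on simulated same-modality pairs (an augmented view of the
    # visible image); alpha is a hypothetical weighting hyperparameter.
    supervised = descriptor_loss(desc_vis, desc_ir)
    self_supervised = descriptor_loss(desc_vis, desc_vis_aug)
    return supervised + alpha * self_supervised
```

In this sketch the self-supervised term uses the same loss as the supervised one, so it adds an easier optimization target without requiring any extra data, matching the property claimed in the abstract; the actual loss function and weighting used by S2-Net are described in the paper itself.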