Paper Title

Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation

Paper Authors

Mingmin Zhen, Shiwei Li, Lei Zhou, Jiaxiang Shang, Haoan Feng, Tian Fang, Long Quan

Paper Abstract

In this paper, we introduce a novel network, called the discriminative feature network (DFNet), to address the unsupervised video object segmentation task. To capture the inherent correlation among video frames, we learn discriminative features (D-features) from the input images that reveal the feature distribution from a global perspective. The D-features are then used to establish correspondence with all features of the test image under a conditional random field (CRF) formulation, which is leveraged to enforce consistency between pixels. The experiments verify that DFNet outperforms state-of-the-art methods by a large margin with a mean IoU score of 83.4% and ranks first on the DAVIS-2016 leaderboard, while using far fewer parameters and running more efficiently in the inference phase. We further evaluate DFNet on the FBMS dataset and the video saliency dataset ViSal, reaching a new state of the art. To further demonstrate the generalizability of our framework, DFNet is also applied to the image object co-segmentation task. We perform experiments on the challenging PASCAL-VOC dataset and observe the superiority of DFNet. These thorough experiments verify that DFNet is able to capture and mine the underlying relations of images and discover the common foreground objects.
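
The abstract only describes the CRF-based correspondence at a high level. As a rough, non-authoritative illustration of the general idea, the sketch below computes an attention-style affinity between a set of learned D-features and every spatial feature of a test frame; all tensor names, shapes, and the dot-product-plus-softmax formulation are assumptions made for illustration, not the paper's actual implementation.

    import torch
    import torch.nn.functional as F

    def correspondence_map(d_features, frame_features):
        # d_features:     (K, C) tensor of K learned discriminative feature vectors (assumed shape)
        # frame_features: (C, H, W) per-pixel features of the test frame (assumed shape)
        # returns:        (K, H, W) soft correspondence of each D-feature to every pixel
        C, H, W = frame_features.shape
        pixels = frame_features.reshape(C, H * W)      # (C, HW)
        affinity = d_features @ pixels                 # (K, HW) dot-product similarity
        weights = F.softmax(affinity, dim=1)           # normalize over all pixels
        return weights.reshape(-1, H, W)

    # Toy usage with random tensors, just to illustrate the shapes involved.
    d = torch.randn(8, 256)          # 8 hypothetical D-features of dimension 256
    f = torch.randn(256, 30, 54)     # feature map of one test frame
    corr = correspondence_map(d, f)  # -> (8, 30, 54)

Viewed through a fully connected CRF, pairwise affinities of this kind would play the role of the pairwise term that enforces consistency between pixels, with the per-pixel predictions acting as the unary term.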
