Paper Title
DyStaB: Unsupervised Object Segmentation via Dynamic-Static Bootstrapping
Paper Authors
Paper Abstract
We describe an unsupervised method to detect and segment portions of images of live scenes that, at some point in time, are seen moving as a coherent whole, which we refer to as objects. Our method first partitions the motion field by minimizing the mutual information between segments. Then, it uses the segments to learn object models that can be used for detection in a static image. Static and dynamic models are represented by deep neural networks trained jointly in a bootstrapping strategy, which enables extrapolation to previously unseen objects. While the training process requires motion, the resulting object segmentation network can be used on either static images or videos at inference time. As the volume of seen videos grows, more and more objects are seen moving, priming their detection, which then serves as a regularizer for new objects, turning our method into unsupervised continual learning to segment objects. Our models are compared to the state of the art in both video object segmentation and salient object detection. In the six benchmark datasets tested, our models compare favorably even to those using pixel-level supervision, despite requiring no manual annotation.
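The core quantity in the motion-field partitioning step is the mutual information between segments, which the method minimizes so that the motion in one segment carries as little information as possible about the other. A minimal sketch of estimating that quantity empirically from two sample arrays (a toy illustration with a histogram estimator; the function name, binning scheme, and sample data are assumptions, not the paper's implementation):

```python
import numpy as np

def mutual_information(x, y, bins=8):
    """Empirical mutual information I(X;Y) in nats between two 1-D
    sample arrays, estimated from a joint histogram. Toy stand-in for
    the dependence measure DyStaB minimizes between the motion
    statistics of two segments; not the paper's actual estimator."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()                  # joint distribution
    px = pxy.sum(axis=1, keepdims=True)        # marginal of X
    py = pxy.sum(axis=0, keepdims=True)        # marginal of Y
    nz = pxy > 0                               # avoid log(0)
    return float((pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])).sum())

# Independent "motions" in two segments give MI near zero (a good
# partition); identical motions give large MI (a bad partition).
rng = np.random.default_rng(0)
a = rng.normal(size=10_000)
b = rng.normal(size=10_000)      # independent of a
print(mutual_information(a, b))  # small: segments move independently
print(mutual_information(a, a))  # large: segments move together
```

A partition whose segments achieve low mutual information in this sense is one in which each segment moves as a coherent whole, independent of the rest of the scene, which is exactly the notion of "object" the abstract adopts.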