Paper Title
Fast Template Matching and Update for Video Object Tracking and Segmentation
Authors
Abstract
In this paper, we tackle multi-instance semi-supervised video object segmentation across a sequence of frames, where only the box-level ground truth for the first frame is provided. Detection-based algorithms are widely adopted for this task; the challenges lie in selecting the matching method used to predict the result and in deciding whether to update the target template with the newly predicted result. Existing methods, however, make these choices in a rough and inflexible way, which compromises their performance. To overcome this limitation, we propose a novel approach that uses reinforcement learning to make both decisions at the same time. Specifically, the reinforcement learning agent learns to decide whether to update the target template according to the quality of the predicted result. The matching method is chosen at the same time, based on the agent's action history. Experiments show that our method is almost 10 times faster than the previous state-of-the-art method while achieving even higher accuracy (region similarity of 69.1% on the DAVIS 2017 dataset).
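The two coupled decisions described in the abstract, and the region-similarity metric it reports, can be illustrated with a minimal Python sketch. Only `region_similarity` reflects a standard definition (the Jaccard index, i.e. intersection over union of two masks). Everything else is an assumption made for illustration and is not taken from the paper: the names `choose_matcher` and `track`, the labels "fast" and "slow", the `patience` parameter, and the fixed `update_threshold` that stands in for the learned reinforcement learning policy.

```python
import numpy as np


def region_similarity(pred_mask, gt_mask):
    """Jaccard index (intersection over union) of two binary masks."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; treating this as perfect agreement is a convention.
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def choose_matcher(action_history, patience=3):
    """Hypothetical rule: pick the matching method from the agent's past actions.

    If the last `patience` actions were all "do not update", assume the target
    may be lost and switch to a slower, more thorough matcher.
    """
    recent = action_history[-patience:]
    if len(recent) == patience and not any(recent):
        return "slow"
    return "fast"


def track(qualities, update_threshold=0.5, patience=3):
    """Toy control loop over per-frame quality scores.

    `qualities` stands in for the quality of each predicted result, and the
    threshold test stands in for the learned agent (an assumption).
    Returns one (matcher, update_template) pair per frame.
    """
    history = []
    log = []
    for quality in qualities:
        matcher = choose_matcher(history, patience)
        update_template = quality >= update_threshold
        history.append(update_template)
        log.append((matcher, update_template))
    return log


if __name__ == "__main__":
    print(region_similarity([[1, 1], [0, 0]], [[1, 0], [0, 0]]))
    for frame, decision in enumerate(track([0.9, 0.2, 0.1, 0.3, 0.8])):
        print(frame, decision)
```

In this sketch, a run of rejected updates is read as a sign that the template no longer matches the target, which is one plausible way an action history could drive the choice of matching method. A trained agent would replace the threshold test with a policy learned from reward.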