Paper Title
Probabilistic Regression for Visual Tracking
Paper Authors
Abstract
Visual tracking is fundamentally the problem of regressing the state of the target in each video frame. While significant progress has been achieved, trackers are still prone to failures and inaccuracies. It is therefore crucial to represent the uncertainty in the target estimation. Although current prominent paradigms rely on estimating a state-dependent confidence score, this value lacks a clear probabilistic interpretation, complicating its use. In this work, we therefore propose a probabilistic regression formulation and apply it to tracking. Our network predicts the conditional probability density of the target state given an input image. Crucially, our formulation is capable of modeling label noise stemming from inaccurate annotations and ambiguities in the task. The regression network is trained by minimizing the Kullback-Leibler divergence. When applied for tracking, our formulation not only allows a probabilistic representation of the output, but also substantially improves the performance. Our tracker sets a new state-of-the-art on six datasets, achieving 59.8% AUC on LaSOT and 75.8% Success on TrackingNet. The code and models are available at https://github.com/visionml/pytracking.
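The abstract's core idea can be illustrated with a minimal sketch. This is not the paper's implementation (see the linked repository for that); it is a hypothetical example assuming a discretized 1-D grid of candidate target states, where the network's raw scores are turned into a predicted density via softmax, the label is smoothed into a Gaussian to model annotation noise, and training minimizes the KL divergence between the two. All names and parameters (`kl_divergence_loss`, `label_sigma`, the toy score function) are illustrative.

```python
import numpy as np

def kl_divergence_loss(scores, centers, target, label_sigma=0.05):
    """Illustrative sketch: KL divergence between a smoothed label
    distribution and a predicted density over a discretized grid of
    candidate target states. Names and parameters are hypothetical."""
    # Predicted density: softmax over the network's raw scores
    # (subtracting the max for numerical stability).
    p = np.exp(scores - scores.max())
    p /= p.sum()
    # Label distribution: a Gaussian centered on the annotation,
    # modeling label noise/ambiguity instead of a delta function.
    q = np.exp(-0.5 * ((centers - target) / label_sigma) ** 2)
    q /= q.sum()
    # KL(q || p); grid points where q == 0 contribute nothing.
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

# Toy usage: 101 candidate states in [0, 1], scores peaked near 0.5.
centers = np.linspace(0.0, 1.0, 101)
scores = -200.0 * (centers - 0.52) ** 2  # stand-in for network output
loss = kl_divergence_loss(scores, centers, target=0.5)
```

A prediction peaked near the true state yields a lower loss than one peaked elsewhere, and the loss is non-negative by the properties of KL divergence; gradient descent on this objective would push the predicted density toward the (noise-aware) label distribution.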