Paper title
Revisiting Anchor Mechanisms for Temporal Action Localization
Paper authors
Paper abstract
Most current action localization methods follow an anchor-based pipeline: they depict action instances with pre-defined anchors, learn to select the anchors closest to the ground truth, and predict the confidence of the anchors with refinements. Pre-defined anchors set priors on the location and duration of action instances, which facilitates the localization of common action instances but limits the flexibility to handle action instances that vary drastically, especially extremely short or extremely long ones. To address this problem, this paper proposes a novel anchor-free action localization module that assists action localization by temporal points. Specifically, this module represents an action instance as a point together with its distances to the starting boundary and the ending boundary, alleviating the pre-defined anchor restrictions in terms of action localization and duration. The proposed anchor-free module is capable of predicting action instances whose duration is either extremely short or extremely long. By combining the proposed anchor-free module with a conventional anchor-based module, we propose a novel action localization framework, called A2Net. The cooperation between the anchor-free and anchor-based modules outperforms the state of the art on THUMOS14 (45.5% vs. 42.8%). Furthermore, comprehensive experiments demonstrate the complementarity between the anchor-free and anchor-based modules, making A2Net simple but effective.
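The point-plus-distances representation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `encode_anchor_free` and `decode_anchor_free`, the use of seconds as the time unit, and the plain-list inputs are all assumptions made for illustration. The sketch only shows how a temporal point and its two boundary distances map to and from a (start, end) segment, with no anchor length fixed in advance.

```python
from typing import List, Tuple


def encode_anchor_free(t: float, start: float, end: float) -> Tuple[float, float]:
    """Hypothetical helper: express a segment relative to a point t inside it.

    Returns (distance to the starting boundary, distance to the ending boundary).
    """
    if not start <= t <= end:
        raise ValueError("the temporal point must lie inside the action instance")
    return (t - start, end - t)


def decode_anchor_free(
    points: List[float],
    dist_to_start: List[float],
    dist_to_end: List[float],
) -> List[Tuple[float, float]]:
    """Hypothetical helper: recover (start, end) segments from points and distances."""
    segments = []
    for t, d_start, d_end in zip(points, dist_to_start, dist_to_end):
        if d_start < 0 or d_end < 0:
            raise ValueError("boundary distances must be non-negative")
        # No pre-defined anchor length is involved: the two distances alone
        # determine the duration, so very short and very long segments are
        # represented in the same way.
        segments.append((t - d_start, t + d_end))
    return segments


# One very short and one very long instance, decoded by the same rule.
short_and_long = decode_anchor_free([5.0, 50.0], [0.5, 40.0], [0.25, 45.0])
```

In an actual model the distances would be regressed by a network at each temporal location; here they are supplied by hand to keep the example self-contained.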