Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video

Paper Title

Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video

Authors

Jie Wu, Guanbin Li, Si Liu, Liang Lin

Abstract

Temporal language grounding in untrimmed videos is a newly raised task in video understanding. Most existing methods suffer from low efficiency, lack interpretability, and deviate from the human perception mechanism. Inspired by the human coarse-to-fine decision-making paradigm, we formulate a novel Tree-Structured Policy based Progressive Reinforcement Learning (TSP-PRL) framework that sequentially regulates the temporal boundary through an iterative refinement process. Semantic concepts are explicitly represented as branches in the policy, which helps decompose complex policies efficiently into interpretable primitive actions. Progressive reinforcement learning provides correct credit assignment via two task-oriented rewards that encourage mutual promotion within the tree-structured policy. We extensively evaluate TSP-PRL on the Charades-STA and ActivityNet datasets, and experimental results show that TSP-PRL achieves competitive performance compared with existing state-of-the-art methods.
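The coarse-to-fine boundary refinement described in the abstract can be illustrated with a small Python sketch. This is not the authors' implementation: the branch names, the step sizes (10% and 2% of the video duration), and the helper functions are hypothetical, and a greedy oracle that looks at the ground-truth segment stands in for the learned root and leaf policies. The two task-oriented rewards are not modeled here.

```python
from typing import Callable, Dict, List, Tuple

Segment = Tuple[float, float]  # (start, end) in seconds
Action = Callable[[Segment], Segment]


def temporal_iou(a: Segment, b: Segment) -> float:
    """Temporal intersection-over-union of two segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def _clip(seg: Segment, duration: float, fallback: Segment) -> Segment:
    """Clamp to [0, duration]; keep the old segment if the result is empty."""
    start, end = max(0.0, seg[0]), min(duration, seg[1])
    return (start, end) if end > start else fallback


def build_action_tree(duration: float) -> Dict[str, List[Action]]:
    """Hypothetical two-level tree: branches (root) hold primitive actions (leaves)."""
    coarse, fine = 0.10 * duration, 0.02 * duration  # assumed step sizes

    def shift(d: float) -> Action:
        return lambda s: _clip((s[0] + d, s[1] + d), duration, s)

    def move_start(d: float) -> Action:
        return lambda s: _clip((s[0] + d, s[1]), duration, s)

    def move_end(d: float) -> Action:
        return lambda s: _clip((s[0], s[1] + d), duration, s)

    return {
        "coarse_shift": [shift(-coarse), shift(coarse)],
        "fine_start": [move_start(-fine), move_start(fine)],
        "fine_end": [move_end(-fine), move_end(fine)],
    }


def refine(segment: Segment, target: Segment, duration: float,
           max_steps: int = 20) -> Segment:
    """Iteratively adjust the boundary; stop when no action improves the IoU.

    The oracle below replaces the learned policy purely for illustration:
    the leaf level picks the best primitive action inside each branch,
    and the root level picks the best branch.
    """
    tree = build_action_tree(duration)

    def score(s: Segment) -> float:
        return temporal_iou(s, target)

    for _ in range(max_steps):
        per_branch = [max((act(segment) for act in leaves), key=score)
                      for leaves in tree.values()]
        best = max(per_branch, key=score)
        if score(best) <= score(segment):
            break  # no branch helps any more
        segment = best
    return segment


if __name__ == "__main__":
    start_guess, truth = (10.0, 20.0), (14.0, 22.0)
    refined = refine(start_guess, truth, duration=30.0)
    print(refined, temporal_iou(refined, truth))
```

In a learned setting, the oracle would be replaced by a policy network that conditions on video and sentence features, since the ground-truth segment is not available at inference time.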
