Paper title
Inferring Temporal Compositions of Actions Using Probabilistic Automata
Authors
Abstract
This paper presents a framework to recognize temporal compositions of atomic actions in videos. Specifically, we propose to express temporal compositions of actions as semantic regular expressions and derive an inference framework using probabilistic automata to recognize complex actions as satisfying these expressions on the input video features. Our approach is different from existing works that either predict long-range complex activities as unordered sets of atomic actions, or retrieve videos using natural language sentences. Instead, the proposed approach allows recognizing complex fine-grained activities using only pretrained action classifiers, without requiring any additional data, annotations or neural network training. To evaluate the potential of our approach, we provide experiments on synthetic datasets and challenging real action recognition datasets, such as MultiTHUMOS and Charades. We conclude that the proposed approach can extend state-of-the-art primitive action classifiers to vastly more complex activities without large performance degradation.
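To make the idea of "satisfying a regular expression on the input video features" concrete, the following is a minimal illustrative sketch, not the authors' implementation. It assumes that a pretrained classifier yields, for each frame, a probability distribution over atomic actions, that frames are treated as independent, and that the regular expression has already been compiled by hand into a deterministic automaton. The action names `RUN` and `JUMP`, the transition table, and the function `acceptance_probability` are all hypothetical. The sketch propagates probability mass over automaton states frame by frame and returns the mass that ends in an accepting state.

```python
def acceptance_probability(frame_probs, transitions, start, accepting):
    """Probability that a sequence of uncertain frame labels is accepted.

    frame_probs: list of per-frame distributions over atomic actions
                 (assumed to be normalised classifier outputs).
    transitions: transitions[action][state] -> next state (deterministic table).
    start:       index of the initial state.
    accepting:   set of accepting state indices.
    """
    n_states = len(transitions[0])
    state = [0.0] * n_states
    state[start] = 1.0  # all probability mass begins in the start state
    for probs in frame_probs:
        nxt = [0.0] * n_states
        for action, p in enumerate(probs):
            for src, mass in enumerate(state):
                # Move the mass in `src` along the edge labelled `action`,
                # weighted by the classifier's belief in that action.
                nxt[transitions[action][src]] += p * mass
        state = nxt
    return sum(state[s] for s in accepting)


# Hypothetical example: the expression "run+ jump+" over two atomic actions.
# States: 0 = start, 1 = running seen, 2 = jumping after running (accepting),
#         3 = dead state (pattern violated).
RUN, JUMP = 0, 1
TRANSITIONS = [
    [1, 1, 3, 3],  # next state when reading RUN from states 0..3
    [3, 2, 2, 3],  # next state when reading JUMP from states 0..3
]

# Two frames: the first is probably RUN, the second is probably JUMP.
score = acceptance_probability(
    [[0.9, 0.1], [0.2, 0.8]], TRANSITIONS, start=0, accepting={2}
)
print(score)  # approximately 0.72 (0.9 * 0.8, the only accepted labelling)
```

Because the update is a sum of products over per-frame classifier outputs, a sketch of this kind needs no training beyond the action classifier itself, which is the property the abstract emphasises. Multi-label datasets such as MultiTHUMOS and Charades produce independent per-action scores rather than a single distribution per frame, so the one-action-per-frame assumption here is a simplification.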