Actor-Transformers for Group Activity Recognition

Paper title

Actor-Transformers for Group Activity Recognition

Authors

Kirill Gavrilyuk, Ryan Sanford, Mehrsan Javan, Cees G. M. Snoek

Abstract

This paper strives to recognize individual actions and group activities from videos. While existing solutions for this challenging problem explicitly model spatial and temporal relationships based on location of individual actors, we propose an actor-transformer model able to learn and selectively extract information relevant for group activity recognition. We feed the transformer with rich actor-specific static and dynamic representations expressed by features from a 2D pose network and 3D CNN, respectively. We empirically study different ways to combine these representations and show their complementary benefits. Experiments show what is important to transform and how it should be transformed. What is more, actor-transformers achieve state-of-the-art results on two publicly available benchmarks for group activity recognition, outperforming the previous best published results by a considerable margin.
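
The idea summarized in the abstract, refining per-actor features with self-attention before predicting a group activity, can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy feature values, fusion by element-wise sum, a single attention head with identity query/key/value projections, and max-pooling over actors. The paper's actual model uses learned projections and features from a 2D pose network and a 3D CNN, and the abstract says several fusion strategies were studied.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse(static_feats, dynamic_feats):
    """Fuse static (pose-like) and dynamic (motion-like) actor features.
    Element-wise sum is an assumption; other fusion schemes are possible."""
    return [[s + d for s, d in zip(sv, dv)]
            for sv, dv in zip(static_feats, dynamic_feats)]

def self_attention(actors):
    """Single-head scaled dot-product self-attention over actors.
    Identity projections are used for brevity (hypothetical simplification)."""
    dim = len(actors[0])
    scale = math.sqrt(dim)
    refined, all_weights = [], []
    for query in actors:
        scores = [sum(a * b for a, b in zip(query, key)) / scale
                  for key in actors]
        weights = softmax(scores)
        all_weights.append(weights)
        # Each refined actor is a weighted mix of all actor features.
        refined.append([sum(weights[j] * actors[j][c]
                            for j in range(len(actors)))
                        for c in range(dim)])
    return refined, all_weights

def pool_group(refined):
    """Max-pool over actors to obtain one group-level vector (assumed pooling)."""
    return [max(column) for column in zip(*refined)]

# Toy features for three actors, two dimensions each (made-up values).
static = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
dynamic = [[0.5, 0.5], [0.5, 0.0], [0.0, 0.5]]

actors = fuse(static, dynamic)
refined, weights = self_attention(actors)
group = pool_group(refined)
```

In this sketch the refined per-actor vectors would feed an individual-action classifier and the pooled vector a group-activity classifier; both classifiers are omitted.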
