Paper Title

SmallBigNet: Integrating Core and Contextual Views for Video Classification

Authors

Xianhang Li, Yali Wang, Zhipeng Zhou, Yu Qiao

Abstract

Temporal convolution has been widely used for video classification. However, it operates on spatio-temporal contexts within a limited view, which often weakens its capacity to learn video representations. To alleviate this problem, we propose a concise and novel SmallBig network, in which a small view and a big view cooperate. For the current time step, the small view branch learns the core semantics, while the big view branch captures the contextual semantics. Unlike traditional temporal convolution, the big view branch can provide the small view branch with the most activated video features from a broader 3D receptive field. By aggregating such big-view contexts, the small view branch can learn more robust and discriminative spatio-temporal representations for video classification. Furthermore, we propose to share convolution between the small and big view branches, which improves model compactness and alleviates overfitting. As a result, our SmallBigNet achieves a model size comparable to that of 2D CNNs, while boosting accuracy to the level of 3D CNNs. We conduct extensive experiments on large-scale video benchmarks, e.g., Kinetics400 and Something-Something V1 and V2. Our SmallBig network outperforms a number of recent state-of-the-art approaches in terms of accuracy and/or efficiency. Code and models will be available at https://github.com/xhl-video/SmallBigNet.
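
The two-branch idea in the abstract can be illustrated with a toy sketch in plain Python. This is not the authors' implementation. It assumes that "most activated features from a broader 3D receptive field" can be modeled as a per-channel max over a local spatio-temporal window, that the shared convolution is a 1x1x1 (pointwise) convolution, and that the two branches are fused by summation. The function names `max_pool_3d`, `pointwise_conv`, and `small_big_unit` and the `radius` parameter are hypothetical.

```python
# Toy sketch of a small-view / big-view unit with shared weights.
# Assumptions (not from the abstract): max pooling models the "most activated"
# context, the shared convolution is 1x1x1, and branches are fused by addition.
# A video is a nested list indexed as video[t][h][w][channel].

def max_pool_3d(video, radius=1):
    """Per-channel max over a (2*radius+1)^3 window, clipped at the borders."""
    T, H, W = len(video), len(video[0]), len(video[0][0])
    C = len(video[0][0][0])
    out = []
    for t in range(T):
        frame = []
        for h in range(H):
            row = []
            for w in range(W):
                ts = range(max(0, t - radius), min(T, t + radius + 1))
                hs = range(max(0, h - radius), min(H, h + radius + 1))
                ws = range(max(0, w - radius), min(W, w + radius + 1))
                row.append([
                    max(video[a][b][c][ch] for a in ts for b in hs for c in ws)
                    for ch in range(C)
                ])
            frame.append(row)
        out.append(frame)
    return out


def pointwise_conv(video, weight):
    """Apply one C_out x C_in matrix at every (t, h, w) location."""
    return [[[[sum(wi * xi for wi, xi in zip(w_row, pixel)) for w_row in weight]
              for pixel in row] for row in frame] for frame in video]


def small_big_unit(video, weight, radius=1):
    """Sum of the small view and the big view, both using the same weights."""
    small = pointwise_conv(video, weight)                     # core semantics
    big = pointwise_conv(max_pool_3d(video, radius), weight)  # contextual semantics
    return [[[[s + b for s, b in zip(ps, pb)] for ps, pb in zip(rs, rb)]
             for rs, rb in zip(fs, fb)] for fs, fb in zip(small, big)]


# Three frames, one pixel, one channel; the shared weight is the scalar 2.0.
video = [[[[1.0]]], [[[5.0]]], [[[2.0]]]]
print(small_big_unit(video, [[2.0]]))
```

Because both branches reuse `weight`, adding the big view brings in no extra parameters in this sketch, which mirrors the abstract's point about keeping the model compact.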
