Paper title
Temporal Interlacing Network
Paper authors
Paper abstract
For a long time, the vision community has tried to learn spatio-temporal representations by combining convolutional neural networks with various temporal models, such as Markov chains, optical flow, RNNs, and temporal convolution. However, these pipelines consume enormous computing resources because spatial and temporal information are learned in alternating steps. A natural question is whether the temporal information can be embedded into the spatial information, so that the two domains are learned jointly in a single pass. In this work, we answer this question by presenting a simple yet powerful operator, the temporal interlacing network (TIN). Instead of learning temporal features, TIN fuses the two kinds of information by interlacing spatial representations from the past into the future, and vice versa. A differentiable interlacing target can be learned to control the interlacing process. In this way, a heavy temporal model is replaced by a simple interlacing operator. We theoretically prove that, with a learnable interlacing target, TIN performs equivalently to the regularized temporal convolution network (r-TCN), but gains 4% more accuracy with 6x less latency on 6 challenging benchmarks. These results push the state of the art in video understanding by a considerable margin. Not surprisingly, the ensemble model of the proposed TIN won 1st place in the ICCV19 - Multi Moments in Time challenge. Code is available at https://github.com/deepcs233/TIN to facilitate further research.
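The interlacing idea described in the abstract, moving spatial features from the past to the future and vice versa, can be illustrated with a small sketch. This is not the authors' implementation and is not taken from the linked repository. It assumes fixed integer offsets per channel group, whereas the abstract describes a learned, differentiable interlacing target. The names `interlace`, `offsets`, and `pad` are hypothetical and chosen only for illustration.

```python
def interlace(frames, offsets, pad=0.0):
    """Shift channel groups along the time axis by fixed integer offsets.

    frames[t][c] holds the feature of channel c at time step t.
    offsets has one integer per channel group (an assumption of this sketch;
    the abstract describes offsets that are learned, not fixed):
      positive -> features move toward later frames (past to future),
      negative -> features move toward earlier frames (future to past),
      zero     -> the group is left in place.
    Positions with no source frame are filled with `pad`.
    """
    num_steps = len(frames)
    num_channels = len(frames[0])
    if num_channels % len(offsets) != 0:
        raise ValueError("channel count must be divisible by the number of groups")
    group_size = num_channels // len(offsets)

    out = [[pad] * num_channels for _ in range(num_steps)]
    for t in range(num_steps):
        for c in range(num_channels):
            # Source time step for this channel after applying its group's offset.
            src = t - offsets[c // group_size]
            if 0 <= src < num_steps:
                out[t][c] = frames[src][c]
    return out


# Four time steps, four channels; every channel at step t holds the value t + 1.
frames = [[float(t + 1)] * 4 for t in range(4)]
mixed = interlace(frames, offsets=[1, -1, 0, 0])
for row in mixed:
    print(row)
```

After the call, each frame contains one channel taken from the previous step, one from the next step, and two unchanged channels, so a purely spatial layer applied afterwards sees information from neighbouring time steps without a separate temporal model.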