Paper Title

Quantifying and Learning Static vs. Dynamic Information in Deep Spatiotemporal Networks

Paper Authors

Matthew Kowal, Mennatullah Siam, Md Amirul Islam, Neil D. B. Bruce, Richard P. Wildes, Konstantinos G. Derpanis

Abstract

There is limited understanding of the information captured by deep spatiotemporal models in their intermediate representations. For example, while evidence suggests that action recognition algorithms are heavily influenced by visual appearance in single frames, no quantitative methodology exists for evaluating such static bias in the latent representation compared to bias toward dynamics. We tackle this challenge by proposing an approach for quantifying the static and dynamic biases of any spatiotemporal model, and apply our approach to three tasks: action recognition, automatic video object segmentation (AVOS), and video instance segmentation (VIS). Our key findings are: (i) Most examined models are biased toward static information. (ii) Some datasets that are assumed to be biased toward dynamics are actually biased toward static information. (iii) Individual channels in an architecture can be biased toward static, dynamic, or a combination of the two. (iv) Most models converge to their culminating biases in the first half of training. We then explore how these biases affect performance on dynamically biased datasets. For action recognition, we propose StaticDropout, a semantically guided dropout that debiases a model from static information toward dynamics. For AVOS, we design a better combination of fusion and cross connection layers compared with previous architectures.
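The abstract does not specify how StaticDropout operates; below is a minimal sketch of one plausible form, assuming each channel carries a precomputed static-bias score in [0, 1] (e.g., from the paper's quantification step) and that channels with higher static bias are dropped with proportionally higher probability. The function name, scoring scheme, and rescaling choice are all assumptions for illustration, not the paper's actual implementation.

```python
import numpy as np

def static_dropout(x, static_scores, rate=0.5, rng=None):
    """Hypothetical sketch: drop channels with probability proportional
    to their static-bias score, pushing the model toward dynamic features.

    x            : activations of shape (C, H, W)
    static_scores: per-channel static-bias scores in [0, 1]
    rate         : maximum drop probability (applied to a fully static channel)
    """
    rng = rng or np.random.default_rng()
    p_drop = rate * np.asarray(static_scores, dtype=float)  # more static-biased -> more likely dropped
    keep = rng.random(len(p_drop)) >= p_drop                # per-channel Bernoulli keep mask
    # inverted-dropout rescaling so expected activation magnitude is preserved
    scale = np.where(p_drop < 1.0, 1.0 / (1.0 - p_drop), 0.0)
    return x * (keep * scale)[:, None, None]

# Channels with score 0 are never dropped; channels with score 1 are
# dropped with probability `rate` and rescaled by 1 / (1 - rate) when kept.
x = np.ones((4, 2, 2))
out = static_dropout(x, [0.0, 0.0, 1.0, 1.0], rate=0.5,
                     rng=np.random.default_rng(0))
```

Under this reading, dropout is "semantically guided" in that the mask is biased by what each channel encodes rather than being uniform across channels, as in standard dropout.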
