Paper Title

Short-Term and Long-Term Context Aggregation Network for Video Inpainting

Authors

Ang Li, Shanshan Zhao, Xingjun Ma, Mingming Gong, Jianzhong Qi, Rui Zhang, Dacheng Tao, Ramamohanarao Kotagiri

Abstract

Video inpainting aims to restore the missing regions of a video and has many applications, such as video editing and object removal. However, existing methods either suffer from inaccurate short-term context aggregation or rarely explore long-term frame information. In this work, we present a novel context aggregation network that effectively exploits both short-term and long-term frame information for video inpainting. In the encoding stage, we propose boundary-aware short-term context aggregation, which aligns local regions from neighboring frames that are closely related to the boundary context of the missing regions and aggregates them into the target frame. Furthermore, we propose dynamic long-term context aggregation, which globally refines the feature map generated in the encoding stage using long-term frame features that are dynamically updated throughout the inpainting process. Experiments show that our network outperforms state-of-the-art methods, producing better inpainting results at faster inpainting speed.
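The abstract describes the two modules only at a high level. As a rough illustration, and not the authors' implementation, the following PyTorch-style sketch shows one plausible way to realize the two steps: flow-based warping of neighboring-frame features restricted to the boundary region for the short-term step, and attention over a pooled memory of long-term frame features for the long-term step. All function names, tensor shapes, the flow inputs, and the attention formulation are assumptions introduced here for illustration.

```python
# Hypothetical sketch, NOT the paper's code: names, shapes, and the exact
# aggregation rules are assumptions made for illustration only.
import torch
import torch.nn.functional as F

def short_term_aggregate(target_feat, neighbor_feats, flows, boundary_mask):
    """Boundary-aware short-term aggregation (assumed design): warp
    neighboring-frame features toward the target frame with optical flow,
    then blend them in only around the boundary of the missing region.

    target_feat:    (C, H, W) encoded target-frame features
    neighbor_feats: (N, C, H, W) encoded neighboring-frame features
    flows:          (N, 2, H, W) flows from neighbors to target (assumed given)
    boundary_mask:  (1, H, W) soft mask of the missing-region boundary
    """
    N, C, H, W = neighbor_feats.shape
    ys, xs = torch.meshgrid(torch.arange(H), torch.arange(W), indexing="ij")
    base = torch.stack((xs, ys), dim=0).float()           # (2, H, W) pixel grid
    coords = base.unsqueeze(0) + flows                    # (N, 2, H, W)
    grid = torch.stack((2 * coords[:, 0] / (W - 1) - 1,   # x normalized to [-1, 1]
                        2 * coords[:, 1] / (H - 1) - 1),  # y normalized to [-1, 1]
                       dim=-1)                            # (N, H, W, 2)
    warped = F.grid_sample(neighbor_feats, grid, align_corners=True)
    agg = (warped * boundary_mask).mean(dim=0)            # average aligned regions
    return target_feat * (1 - boundary_mask) + agg * boundary_mask

def long_term_refine(feat, memory):
    """Dynamic long-term aggregation (assumed formulation): globally refine
    the encoded feature map by attending over a memory of pooled long-term
    frame features, which the caller updates as inpainting proceeds.

    feat:   (C, H, W) feature map from the encoding stage
    memory: (M, C) long-term frame features (dynamically updated outside)
    """
    C, H, W = feat.shape
    q = feat.flatten(1).t()                                 # (H*W, C) queries
    attn = torch.softmax(q @ memory.t() / C ** 0.5, dim=-1) # (H*W, M) weights
    refined = attn @ memory                                 # (H*W, C)
    return (q + refined).t().reshape(C, H, W)               # residual refinement
```

A complete model would additionally need the encoder and decoder, a flow estimator for the short-term step, and a rule for updating `memory` over time; those components are omitted from this sketch.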
