Paper Title
Learned Video Compression with Feature-level Residuals
Authors
Abstract
In this paper, we present an end-to-end video compression network for the CLIC P-frame challenge. We focus on deep neural network (DNN) based video compression and improve on current frameworks in three respects. First, we observe that pixel-space residuals are sensitive to the prediction errors of optical-flow-based motion compensation. To suppress this influence, we propose compressing the residuals of image features rather than the residuals of image pixels. Second, we combine the advantages of pixel-level and feature-level residual compression through model ensembling. Finally, we propose a step-by-step training strategy to improve the training efficiency of the whole framework. Experimental results indicate that the proposed method achieves an MS-SSIM of 0.9968 on the CLIC validation set and 0.9967 on the test set.
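The contrast between pixel-level and feature-level residuals can be illustrated with a small toy sketch. It is not the network described in the abstract: the function `box_features` is a hypothetical stand-in (a fixed 3x3 mean filter) for a learned feature extractor, and the one-pixel shift is an assumed example of a motion-compensation error. Under these assumptions, the sketch compares the energy of the two residuals when the prediction is slightly misaligned.

```python
def step_edge(size, edge_col):
    """Square image with a vertical step edge: 0 left of edge_col, 1 from edge_col on."""
    return [[1.0 if c >= edge_col else 0.0 for c in range(size)] for _ in range(size)]

def box_features(img, k=3):
    """Hypothetical stand-in for a learned feature extractor: k x k mean filter, no padding."""
    h, w = len(img), len(img[0])
    return [
        [sum(img[i + di][j + dj] for di in range(k) for dj in range(k)) / (k * k)
         for j in range(w - k + 1)]
        for i in range(h - k + 1)
    ]

def subtract(a, b):
    """Element-wise difference of two equally sized 2-D lists."""
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def mean_square(a):
    """Mean squared value of a 2-D list, used here as the residual energy."""
    values = [v * v for row in a for v in row]
    return sum(values) / len(values)

# Current frame with an edge at column 4.
frame = step_edge(8, 4)
# Assumed motion-compensated prediction that is off by one pixel (edge at column 5).
prediction = step_edge(8, 5)

# Pixel-level residual: difference taken directly between the images.
pixel_residual = subtract(frame, prediction)
# Feature-level residual: difference taken after feature extraction.
feature_residual = subtract(box_features(frame), box_features(prediction))

pixel_energy = mean_square(pixel_residual)
feature_energy = mean_square(feature_residual)
print(pixel_energy, feature_energy)
```

In this toy setting the pixel residual has energy 0.125, while the feature residual has energy of about 0.056, because the misalignment error is spread into smaller values by the smoothing filter. A learned feature extractor need not behave like a mean filter, so this only illustrates the intuition, not the reported results.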