A new way of video compression via forward-referencing using deep learning

Paper title

A new way of video compression via forward-referencing using deep learning

Authors

Rajin, S. M. A. K., Murshed, M., Paul, M., Teng, S. W., Ma, J.

Abstract

To exploit the high temporal correlation between video frames of the same scene, the current frame is predicted from already-encoded reference frames using block-based motion estimation and compensation techniques. While this approach can efficiently exploit the translational motion of moving objects, it is susceptible to other types of affine motion and to object occlusion/deocclusion. Recently, deep learning has been used to model the high-level structure of human pose in specific actions from short videos and then to generate virtual frames at future times by predicting the pose with a generative adversarial network (GAN). Modelling the high-level structure of human pose can therefore exploit semantic correlation by predicting human actions and determining their trajectories. Video surveillance applications will benefit, as large volumes of stored surveillance data can be compressed by estimating human pose trajectories and generating future frames through semantic correlation. This paper explores a new way of video coding by modelling human pose from the already-encoded frames and using the generated frame at the current time as an additional forward-referencing frame. It is expected that the proposed approach can overcome the limitations of traditional backward-referencing frames by predicting the blocks containing the moving objects with lower residuals. Experimental results show that the proposed approach can achieve on average up to 2.83 dB PSNR gain and 25.93% bitrate savings for high-motion video sequences.
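The idea of adding a generated frame to the set of reference frames can be illustrated with a minimal block-matching sketch. This is not the authors' codec: the function names (`block`, `sad`, `best_match`, `choose_reference`), the sum-of-absolute-differences cost, and the full-search strategy are assumptions made for illustration, and the generated forward-referencing frame is assumed to be supplied already (the GAN that would produce it is not modelled here). Frames are plain lists of rows of integer pixel values.

```python
def block(frame, top, left, size):
    """Return the size x size block whose top-left corner is (top, left)."""
    return [row[left:left + size] for row in frame[top:top + size]]


def sad(a, b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))


def best_match(cur, ref, top, left, size, search):
    """Full-search block matching; returns (cost, dy, dx) of the best candidate in ref."""
    h, w = len(ref), len(ref[0])
    target = block(cur, top, left, size)
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            t, l = top + dy, left + dx
            if t < 0 or l < 0 or t + size > h or l + size > w:
                continue  # candidate falls outside the reference frame
            cost = sad(target, block(ref, t, l, size))
            if best is None or cost < best[0]:
                best = (cost, dy, dx)
    return best


def choose_reference(cur, refs, top, left, size=4, search=2):
    """Pick the reference frame (by index) that gives the lowest residual cost.

    refs is assumed to hold the backward references followed by the
    generated forward-referencing frame (hypothetical ordering).
    """
    results = [(best_match(cur, ref, top, left, size, search), i)
               for i, ref in enumerate(refs)]
    (cost, dy, dx), idx = min(results)
    return idx, (dy, dx), cost
```

When an object changes shape or orientation between frames, no displacement of the previous frame reproduces the current block exactly, so a generated frame that anticipates the new pose can win the comparison in `choose_reference` with a smaller residual. A real encoder would weigh the residual against the bits needed to signal the motion vector and reference index, which this sketch leaves out.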
