Title
Boosting neural video codecs by exploiting hierarchical redundancy
Authors
Abstract
In video compression, coding efficiency is improved by reusing pixels from previously decoded frames via motion and residual compensation. We define two levels of hierarchical redundancy in video frames: 1) first-order: redundancy in pixel space, i.e., similarities in pixel values across neighboring frames, which is effectively captured by motion and residual compensation; and 2) second-order: redundancy in the motion and residual maps themselves, which arises from the smooth motion of natural videos. While most of the existing neural video coding literature addresses first-order redundancy, we tackle the problem of capturing second-order redundancy in neural video codecs via predictors. We introduce generic motion and residual predictors that learn to extrapolate from previously decoded data. These predictors are lightweight and can be employed with most neural video codecs to improve their rate-distortion performance. Moreover, while RGB is the dominant colorspace in the neural video coding literature, we introduce general modifications that allow neural video codecs to operate in the YUV420 colorspace, and we report YUV420 results. Our experiments show that using our predictors with a well-known neural video codec yields 38% and 34% bitrate savings in the RGB and YUV420 colorspaces, respectively, measured on the UVG dataset.
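The second-order redundancy described above can be illustrated with a toy sketch. The paper's predictors are learned networks; here we assume only a simple linear extrapolation from the two previously decoded motion maps (a hypothetical stand-in, not the paper's method). For smoothly varying motion, the prediction residual has far less energy than the raw motion map, so it would cost fewer bits to entropy-code:

```python
import numpy as np

def linear_motion_predictor(prev_flow, prev_prev_flow):
    """Extrapolate the next motion map linearly from the two previous ones.

    This is a toy, non-learned predictor used only to illustrate
    second-order redundancy; the paper's predictors are learned.
    """
    return 2.0 * prev_flow - prev_prev_flow

def energy(x):
    """Mean squared magnitude, a rough proxy for coding cost."""
    return float(np.mean(x ** 2))

# Synthetic smoothly drifting motion field: flow_t = base + t * drift,
# mimicking the smooth motion assumption for natural videos.
rng = np.random.default_rng(0)
base = rng.standard_normal((2, 8, 8))    # 2-channel (dx, dy) flow map
drift = 0.1 * rng.standard_normal((2, 8, 8))
flows = [base + t * drift for t in range(3)]

predicted = linear_motion_predictor(flows[1], flows[0])
residual = flows[2] - predicted

# The residual is much smaller than the raw motion map: encoding only the
# residual exploits the second-order redundancy.
print("raw motion energy:     ", energy(flows[2]))
print("prediction residual:   ", energy(residual))
```

For this perfectly linear synthetic motion the residual is essentially zero; real videos deviate from linearity, which is why a learned predictor that can model more general temporal patterns is preferable.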