Paper Title
Foveation-based Deep Video Compression without Motion Search
Paper Authors
Paper Abstract
The requirements of much larger file sizes, different storage formats, and immersive viewing conditions of VR pose significant challenges to the goals of acquiring, transmitting, compressing, and displaying high-quality VR content. At the same time, the great potential of deep learning to advance progress on the video compression problem has driven a significant research effort. Because of the high bandwidth requirements of VR, there has also been significant interest in the use of space-variant, foveated compression protocols. We have integrated these techniques to create an end-to-end deep learning video compression framework. A feature of our new compression model is that it dispenses with the need for expensive search-based motion prediction computations. This is accomplished by exploiting statistical regularities inherent in video motion expressed by displaced frame differences. Foveation protocols are desirable since only a small portion of a video viewed in VR may be visible as a user gazes in any given direction. Moreover, even within a current field of view (FOV), the resolution of retinal neurons rapidly decreases with distance (eccentricity) from the projected point of gaze. In our learning-based approach, we implement foveation by introducing a Foveation Generator Unit (FGU) that generates foveation masks which direct the allocation of bits, significantly increasing compression efficiency while making it possible to retain an impression of little to no additional visual loss given an appropriate viewing geometry. Our experimental results reveal that our new compression model, which we call the Foveated MOtionless VIdeo Codec (Foveated MOVI-Codec), is able to efficiently compress videos without computing motion, while outperforming foveated versions of both H.264 and H.265 on the widely used UVG dataset and on the HEVC Standard Class B Test Sequences.
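To make the foveation idea concrete, below is a minimal illustrative sketch of an eccentricity-weighted mask of the kind that could guide bit allocation. It is not the paper's actual FGU (which is a learned network); the function name, the inverse-linear acuity falloff, the half-resolution eccentricity of 2.3 degrees, and the fixed pixels-per-degree value are all illustrative assumptions.

```python
import numpy as np

def foveation_mask(height, width, gaze_yx,
                   half_res_ecc_deg=2.3, pixels_per_degree=32.0):
    """Illustrative foveation weight map (NOT the paper's learned FGU).

    Quality weight is 1.0 at the gaze point and decays with
    eccentricity e following an inverse-linear model e2 / (e + e2),
    where e2 (half_res_ecc_deg) is an assumed constant.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    gy, gx = gaze_yx
    # Eccentricity in visual degrees, assuming a fixed viewing
    # geometry summarized by pixels_per_degree.
    ecc = np.hypot(ys - gy, xs - gx) / pixels_per_degree
    # Weight 1.0 at the fovea, decaying smoothly toward the periphery.
    return half_res_ecc_deg / (ecc + half_res_ecc_deg)

mask = foveation_mask(64, 64, gaze_yx=(32, 32))
```

In a foveated codec, such a map would scale the bit budget per region, spending most bits near the gaze point where retinal resolution is highest.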