Paper Title
An Interactive Annotation Tool for Perceptual Video Compression
Paper Authors
Abstract
Human perception is at the core of lossy video compression, yet it is challenging to collect data that is sufficiently dense to drive compression. In perceptual quality assessment, human feedback is typically collected as a single scalar quality score indicating a preference for one distorted video over another. In reality, some videos may be better in some parts but not in others. We propose an approach to collecting finer-grained feedback by asking users to use an interactive tool to directly optimize perceptual quality at a fixed bitrate. To this end, we built a novel web tool that allows users to paint spatio-temporal importance maps over videos. The tool supports interactive successive refinement: we iteratively re-encode the original video according to the painted importance maps while maintaining the same bitrate, allowing the user to see the trade-off of assigning higher importance to one spatio-temporal part of the video at the cost of others. We use this tool to collect data in the wild (10 videos, 17 users) and use the obtained importance maps in the context of x264 coding to demonstrate, through a subjective study, that the tool can indeed be used to generate videos that look perceptually better at the same bitrate and are 1.9 times more likely to be preferred by viewers. The code for the tool and dataset can be found at https://github.com/jenyap/video-annotation-tool.git.