Paper title
DeepQTMT: A Deep Learning Approach for Fast QTMT-based CU Partition of Intra-mode VVC
Authors
Abstract
Versatile Video Coding (VVC), as the latest standard, significantly improves coding efficiency over its predecessor, High Efficiency Video Coding (HEVC), but at the expense of sharply increased complexity. In VVC, the quad-tree plus multi-type tree (QTMT) structure of coding unit (CU) partition accounts for over 97% of the encoding time, due to the brute-force search for recursive rate-distortion (RD) optimization. Instead of the brute-force QTMT search, this paper proposes a deep learning approach to predict the QTMT-based CU partition, drastically accelerating the encoding process of intra-mode VVC. First, we establish a large-scale database containing sufficient CU partition patterns with diverse video content, which can facilitate data-driven VVC complexity reduction. Next, we propose a multi-stage exit CNN (MSE-CNN) model with an early-exit mechanism to determine the CU partition, in accordance with the flexible QTMT structure at multiple stages. Then, we design an adaptive loss function for training the MSE-CNN model, which accounts for both the uncertain number of split modes and the objective of minimizing RD cost. Finally, a multi-threshold decision scheme is developed, achieving a desirable trade-off between complexity and RD performance. Experimental results demonstrate that our approach can reduce the encoding time of VVC by 44.65%-66.88% with a negligible Bjøntegaard delta bit-rate (BD-BR) of 1.322%-3.188%, which significantly outperforms other state-of-the-art approaches.
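To make the multi-threshold decision and early-exit ideas more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that at each stage the model outputs a probability for each of the six QTMT split modes, that modes below a per-stage threshold are skipped by the encoder, and that a CU whose only remaining candidate is "no split" exits early. The function names, mode labels, and threshold values are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# The six QTMT split modes available to a CU in VVC (labels are illustrative).
SPLIT_MODES = [
    "non_split",
    "quad_tree",
    "horz_binary",
    "vert_binary",
    "horz_ternary",
    "vert_ternary",
]


def select_modes(probs: Dict[str, float], threshold: float) -> List[str]:
    """Keep split modes whose predicted probability reaches the threshold.

    Assumption: the most likely mode is always kept, so that at least one
    candidate remains for the RD check even with a very high threshold.
    """
    best = max(probs, key=probs.get)
    return [
        mode
        for mode in SPLIT_MODES
        if mode in probs and (probs[mode] >= threshold or mode == best)
    ]


def should_exit_early(candidates: List[str]) -> bool:
    """Stop partitioning when 'non_split' is the only remaining candidate."""
    return candidates == ["non_split"]


# Hypothetical prediction for one CU at one stage.
example_probs = {
    "non_split": 0.05,
    "quad_tree": 0.55,
    "horz_binary": 0.25,
    "vert_binary": 0.10,
    "horz_ternary": 0.03,
    "vert_ternary": 0.02,
}

# A lower threshold keeps more candidates (better RD, less time saved);
# a higher threshold keeps fewer (more time saved, higher BD-BR risk).
loose = select_modes(example_probs, threshold=0.2)
strict = select_modes(example_probs, threshold=0.9)
```

Under these assumptions, varying the threshold per stage is what would trade encoding time against RD performance: each skipped mode removes one recursive RD search from the encoder.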