Rate distortion optimization over large scale video corpus with machine learning

Paper title

Rate distortion optimization over large scale video corpus with machine learning

Authors

Sam John, Akshay Gadde, Balu Adsumilli

Abstract

We present an efficient codec-agnostic method for bitrate allocation over a large-scale video corpus, with the goal of minimizing the average bitrate subject to constraints on average and minimum quality. Our method clusters the videos in the corpus such that videos within one cluster have similar rate-distortion (R-D) characteristics. We train a support vector machine classifier to predict the R-D cluster of a video using simple video complexity features that are computationally easy to obtain. The model allows us to classify a large sample of the corpus in order to estimate the distribution of the number of videos in each of the clusters. We use this distribution to find the optimal encoder operating point for each R-D cluster. Experiments with the AV1 encoder show that our method can achieve the same average quality over the corpus with $22\%$ less average bitrate.
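
The final step in the abstract, choosing one encoder operating point per R-D cluster so that the average bitrate is minimized under average and minimum quality constraints, can be illustrated with a small sketch. The abstract does not say how the paper solves this problem, so the exhaustive search below is only a stand-in. The function name `allocate`, the cluster counts, and the (bitrate, quality) pairs are all hypothetical values chosen for illustration, not data from the paper.

```python
from itertools import product


def allocate(counts, rd_points, avg_quality_target, min_quality):
    """Pick one (bitrate, quality) operating point per cluster.

    counts[k]     -- estimated number of videos in cluster k (hypothetical)
    rd_points[k]  -- candidate (bitrate, quality) points for cluster k
    Returns (choice, average_bitrate), or (None, inf) if nothing is feasible.
    Exhaustive search is an assumption made here for clarity; it is not
    claimed to be the optimizer used in the paper.
    """
    total = sum(counts)
    # Minimum-quality constraint: discard points below the quality floor.
    feasible = [[p for p in pts if p[1] >= min_quality] for pts in rd_points]

    best_choice, best_rate = None, float("inf")
    for choice in product(*feasible):
        # Corpus-wide averages are weighted by the cluster distribution.
        avg_quality = sum(n * q for n, (_, q) in zip(counts, choice)) / total
        if avg_quality < avg_quality_target:
            continue
        avg_rate = sum(n * r for n, (r, _) in zip(counts, choice)) / total
        if avg_rate < best_rate:
            best_choice, best_rate = choice, avg_rate
    return best_choice, best_rate


# Hypothetical corpus: three clusters ordered from easy to hard content.
counts = [5, 3, 2]
rd_points = [
    [(500, 38), (1000, 42), (2000, 45)],
    [(500, 34), (1000, 38), (2000, 42)],
    [(500, 30), (1000, 34), (2000, 38)],
]
choice, avg_rate = allocate(counts, rd_points, avg_quality_target=38, min_quality=33)
print(choice, avg_rate)
```

With these made-up numbers, encoding every cluster at the same 1000-unit operating point also meets both constraints but averages 1000 units, whereas the per-cluster choice averages 850. This is the kind of gain that motivates allocating per cluster. The search enumerates every combination, so it is only practical for a handful of clusters and operating points.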
