Paper Title

NAViDAd: A No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder

Authors

Martinez, Helard; Farias, M. C.; Hines, A.

Abstract

The development of models for predicting the quality of audio and video signals is a fairly mature field. However, although several multimodal models have been proposed, audio-visual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, there is currently no reliable pixel-based audio-visual quality metric. The approach presented in this work is based on the assumption that autoencoders, fed with descriptive audio and video features, might produce a set of features able to describe the complex interactions between audio and video. Based on this hypothesis, we propose a No-Reference Audio-Visual Quality Metric Based on a Deep Autoencoder (NAViDAd). The model's visual features are natural scene statistics (NSS) and spatial-temporal measures of the video component. The audio features are obtained by computing the spectrogram representation of the audio component. The model is formed by a two-layer framework that includes a deep autoencoder layer and a classification layer. These two layers are stacked and trained to build the deep neural network model. The model is trained and tested using a large set of stimuli containing representative audio and video artifacts. The model performed well when tested against the UnB-AV and LiveNetflix-II databases. Results show that this type of approach produces quality scores that are highly correlated with subjective quality scores.
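The two-stage structure described in the abstract (descriptive audio and video features fed to an autoencoder, whose codes are then passed to a classification layer) can be illustrated with a minimal NumPy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the feature dimensions (25 video, 15 audio), the single 8-unit hidden layer, the five quality classes, the random synthetic data, and the function names `train_autoencoder` and `train_softmax`. The sketch also trains the two stages one after the other and does not attempt the joint training of the stacked network.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def train_autoencoder(X, n_hidden, lr=0.5, epochs=200):
    """One-hidden-layer autoencoder, full-batch gradient descent on MSE (hypothetical helper)."""
    n, d = X.shape
    W1 = rng.normal(0, 0.1, (d, n_hidden)); b1 = np.zeros(n_hidden)
    W2 = rng.normal(0, 0.1, (n_hidden, d)); b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        H = sigmoid(X @ W1 + b1)          # encoder output (the learned code)
        R = H @ W2 + b2                   # linear decoder (reconstruction)
        err = R - X
        losses.append(float((err ** 2).mean()))
        gR = 2.0 * err / err.size         # gradient of the mean squared error w.r.t. R
        gW2 = H.T @ gR; gb2 = gR.sum(axis=0)
        gZ = (gR @ W2.T) * H * (1.0 - H)  # backpropagate through the sigmoid
        gW1 = X.T @ gZ; gb1 = gZ.sum(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return (W1, b1), losses

def train_softmax(H, y, n_classes, lr=0.5, epochs=200):
    """Softmax classification layer trained on the autoencoder codes (hypothetical helper)."""
    n, d = H.shape
    W = np.zeros((d, n_classes)); b = np.zeros(n_classes)
    Y = np.eye(n_classes)[y]              # one-hot targets
    for _ in range(epochs):
        G = (softmax(H @ W + b) - Y) / n  # cross-entropy gradient w.r.t. the logits
        W -= lr * (H.T @ G); b -= lr * G.sum(axis=0)
    return W, b

# Synthetic stand-ins for the real features (assumed sizes, random values).
n_clips = 200
video_feats = rng.random((n_clips, 25))   # e.g. NSS and spatial-temporal measures
audio_feats = rng.random((n_clips, 15))   # e.g. spectrogram-derived values
labels = rng.integers(0, 5, n_clips)      # 5 hypothetical quality classes

X = np.hstack([video_feats, audio_feats]) # joint audio-visual feature vector per clip
(W1, b1), losses = train_autoencoder(X, n_hidden=8)
codes = sigmoid(X @ W1 + b1)              # low-dimensional audio-visual codes
Wc, bc = train_softmax(codes, labels, n_classes=5)
probs = softmax(codes @ Wc + bc)          # class probabilities per clip
pred_class = probs.argmax(axis=1)         # predicted quality class per clip
```

Because the data and labels are random, the predictions carry no meaning; the sketch only shows how features, codes, and class probabilities flow through the two stages.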
