Paper Title
On The Differences Between Song and Speech Emotion Recognition: Effect of Feature Sets, Feature Types, and Classifiers
Paper Authors
Paper Abstract
In this paper, we evaluate different feature sets, feature types, and classifiers for both song and speech emotion recognition. Three feature sets (GeMAPS, pyAudioAnalysis, and LibROSA), two feature types (low-level descriptors and high-level statistical functions), and four classifiers (multilayer perceptron, LSTM, GRU, and convolutional neural network) are examined on both song and speech data with the same parameter values. The results show no remarkable difference between song and speech data when the same method is used. In addition, high-level statistical functions of acoustic features achieved higher performance scores than low-level descriptors in this classification task. This result strengthens the previous finding on the regression task, which reported the advantage of using high-level features.
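The distinction between the two feature types can be illustrated with a minimal sketch. It assumes that low-level descriptors are per-frame feature vectors and that the high-level statistical functions are the per-feature mean and standard deviation taken over all frames. The abstract does not state which statistics were used, so this choice, the function name `high_level_stats`, and the toy values are illustrative assumptions only.

```python
from statistics import fmean, pstdev


def high_level_stats(frames):
    """Collapse frame-level descriptors into one fixed-length vector.

    frames: list of per-frame feature lists, all of the same length.
    Returns the per-feature means followed by the per-feature
    (population) standard deviations. Mean and std are an assumed
    choice of statistics, not necessarily the ones used in the paper.
    """
    if not frames:
        raise ValueError("need at least one frame")
    columns = list(zip(*frames))  # one tuple of values per feature
    means = [fmean(col) for col in columns]
    stds = [pstdev(col) for col in columns]
    return means + stds


# Toy low-level descriptors: two clips with different frame counts,
# two features per frame (values are made up for illustration).
short_clip = [[0.0, 1.0], [2.0, 3.0]]
long_clip = [[1.0, 1.0], [1.0, 2.0], [1.0, 3.0], [1.0, 2.0]]

print(high_level_stats(short_clip))      # [1.0, 2.0, 1.0, 1.0]
print(len(high_level_stats(long_clip)))  # 4
```

Both clips map to a vector of the same length regardless of duration, which is why features of this kind can be fed directly to a classifier with a fixed input size, whereas frame-level descriptors keep the time axis and need padding or a sequence model.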