Paper Title
Investigating Correlations of Automatically Extracted Multimodal Features and Lecture Video Quality
Paper Authors
Paper Abstract
Ranking and recommendation of multimedia content such as videos is usually realized with respect to their relevance to a user query. However, for lecture videos and MOOCs (Massive Open Online Courses) it is not only required to retrieve relevant videos, but particularly to find lecture videos of high quality that facilitate learning, for instance, independent of the video's or speaker's popularity. Thus, metadata about a lecture video's quality are crucial features for learning contexts, e.g., lecture video recommendation in search-as-learning scenarios. In this paper, we investigate whether automatically extracted features are correlated with quality aspects of a video. A set of scholarly videos from a Massive Open Online Course (MOOC) is analyzed regarding audio, linguistic, and visual features. Furthermore, a set of cross-modal features is proposed, derived by combining transcripts, audio, video, and slide content. A user study is conducted to investigate the correlations between the automatically collected features and human ratings of quality aspects of a lecture video. Finally, the impact of our features on the participants' knowledge gain is discussed.
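The abstract describes correlating automatically extracted features with human quality ratings. As a minimal illustrative sketch (not the authors' actual analysis pipeline), such a correlation could be computed per feature with Spearman's rank correlation; the feature names and values below are purely hypothetical placeholders.

```python
# Illustrative sketch only: correlate extracted per-video features with
# mean human quality ratings using Spearman's rank correlation.
# Feature names and values are hypothetical, not taken from the paper.
import numpy as np
from scipy.stats import spearmanr

# Hypothetical data: one entry per lecture video.
features = {
    "speaking_rate_wpm": np.array([110, 145, 130, 95, 160, 120]),
    "slide_text_density": np.array([0.42, 0.75, 0.55, 0.30, 0.80, 0.50]),
}
human_quality_rating = np.array([4.1, 3.2, 3.8, 4.5, 2.9, 4.0])  # mean rating per video

for name, values in features.items():
    rho, p_value = spearmanr(values, human_quality_rating)
    print(f"{name}: rho={rho:.2f}, p={p_value:.3f}")
```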