Paper title
Incorporating Domain Knowledge To Improve Topic Segmentation Of Long MOOC Lecture Videos
Paper authors
Abstract
Topic segmentation plays an important role in reducing the search space for the topics taught in a lecture video, especially when the video metadata lacks topic-wise segmentation information. This segmentation information reduces the effort users spend searching for, locating, and browsing a topic inside a lecture video. In this work, we propose an algorithm that combines a state-of-the-art language model with a domain knowledge graph to automatically detect the different coherent topics present in a long lecture video. We apply the language model to the speech-to-text transcript to capture the implicit meaning of the whole video, while the knowledge graph provides the domain-specific dependencies between the different concepts of the subject. Leveraging this domain knowledge also lets us capture how the instructor binds and connects different concepts while teaching, which helps us achieve better segmentation accuracy. We tested our approach on NPTEL lecture videos, and a holistic evaluation shows that it outperforms the other methods described in the literature.
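To make the general idea concrete, the following is a minimal illustrative sketch, not the algorithm proposed in the paper. It assumes that a topic boundary can be placed wherever a blended similarity score between adjacent transcript sentences drops below a threshold. A bag-of-words cosine similarity stands in for the language model, and the toy `CONCEPT_GRAPH`, the weight `alpha`, and the `threshold` value are all hypothetical choices made only for illustration.

```python
import math
from collections import Counter

# Hypothetical toy knowledge graph: concept -> directly related concepts.
CONCEPT_GRAPH = {
    "stack": {"push", "pop"},
    "push": {"stack"},
    "pop": {"stack"},
    "queue": {"enqueue", "dequeue"},
    "enqueue": {"queue"},
    "dequeue": {"queue"},
}


def cosine(a, b):
    """Cosine similarity of two bag-of-words Counters (stand-in for LM embeddings)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def concepts(sentence):
    """Words of the sentence that appear as nodes in the toy graph."""
    return {w for w in sentence.lower().split() if w in CONCEPT_GRAPH}


def graph_relatedness(left, right):
    """Fraction of cross-sentence concept pairs that are identical or adjacent."""
    pairs = [(x, y) for x in left for y in right]
    if not pairs:
        return 0.0
    linked = sum(1 for x, y in pairs if x == y or y in CONCEPT_GRAPH[x])
    return linked / len(pairs)


def segment(sentences, alpha=0.5, threshold=0.3):
    """Return indices i where a topic boundary is placed before sentence i."""
    boundaries = []
    for i in range(1, len(sentences)):
        prev, cur = sentences[i - 1], sentences[i]
        text_sim = cosine(Counter(prev.lower().split()), Counter(cur.lower().split()))
        kg_sim = graph_relatedness(concepts(prev), concepts(cur))
        # Blend textual similarity with knowledge-graph relatedness.
        score = alpha * text_sim + (1 - alpha) * kg_sim
        if score < threshold:
            boundaries.append(i)
    return boundaries


transcript = [
    "a stack supports push and pop",
    "push adds an item and pop removes the top item of the stack",
    "a queue supports enqueue and dequeue",
    "enqueue adds at the back and dequeue removes from the front of the queue",
]
print(segment(transcript))  # prints [2]
```

In this toy transcript the only boundary falls before the third sentence, where both the word overlap and the graph relatedness collapse. A realistic system would presumably replace the bag-of-words vectors with sentence embeddings from a pretrained language model and use a much richer domain graph.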