Paper Title

A Baseline Analysis for Podcast Abstractive Summarization

Authors

Chujie Zheng, Harry Jiannan Wang, Kunpeng Zhang, Ling Fan

Abstract

A podcast summary is an important factor in end users' listening decisions and is often considered a critical feature in podcast recommendation systems, as well as in many downstream applications. Existing abstractive summarization approaches are mainly built on models fine-tuned on professionally edited texts such as CNN and DailyMail news. Unlike news, podcasts are often longer, more colloquial and conversational, and noisier, with content from commercials and sponsorships, which makes automatic podcast summarization extremely challenging. This paper presents a baseline analysis of podcast summarization using the Spotify Podcast Dataset provided by TREC 2020. It aims to help researchers understand current state-of-the-art pre-trained models and hence build a foundation for creating better models.
