Multi-Document Scientific Summarization from a Knowledge Graph-Centric View

Paper title

Multi-Document Scientific Summarization from a Knowledge Graph-Centric View

Authors

Pancheng Wang, Shasha Li, Kunyuan Pang, Liangliang He, Dong Li, Jintao Tang, Ting Wang

Abstract

Multi-Document Scientific Summarization (MDSS) aims to produce coherent and concise summaries for clusters of topic-relevant scientific papers. This task requires precise understanding of paper content and accurate modeling of cross-paper relationships. Knowledge graphs convey compact and interpretable structured information for documents, which makes them ideal for content modeling and relationship modeling. In this paper, we present KGSum, an MDSS model centred on knowledge graphs during both the encoding and decoding processes. Specifically, in the encoding process, two graph-based modules are proposed to incorporate knowledge graph information into paper encoding, while in the decoding process, we propose a two-stage decoder that first generates the knowledge graph information of the summary in the form of descriptive sentences and then generates the final summary. Empirical results show that the proposed architecture brings substantial improvements over baselines on the Multi-Xscience dataset.

Paper title