Paper Title

An Enhanced MeanSum Method for Generating Hotel Multi-Review Summarizations

Authors

Geng, Saibo; Antognini, Diego

Abstract

Multi-document summarization is the process of taking multiple texts as input and producing a short summary based on their content. Until recently, multi-document summarizers were mostly supervised and extractive. However, supervised methods require large datasets of paired document-summary examples, which are rare and expensive to produce. In 2018, Chu and Liu proposed an unsupervised multi-document abstractive summarization method (MeanSum) and demonstrated competitive performance compared to extractive methods. Despite good results on automatic metrics, MeanSum has several limitations, notably its inability to deal with multiple aspects. This work uses a Multi-Aspect Masker (MAM) as a content selector to address the multi-aspect issue. Moreover, we propose a regularizer to control the length of the generated summaries. Through a series of experiments on the hotel dataset from TripAdvisor, we validate our assumption and show that our improved model achieves higher ROUGE and sentiment accuracy than the original MeanSum method, and that it beats, or is comparable or close to, the supervised baseline.
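
For readers unfamiliar with the ROUGE metric mentioned in the abstract, the following is a minimal sketch of ROUGE-1 (unigram overlap) in Python. It is not the evaluation code used in the paper: the function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are assumptions made purely for illustration, and published results usually rely on an established ROUGE implementation with stemming and other preprocessing.

```python
from collections import Counter


def rouge_1(candidate: str, reference: str) -> dict:
    """Illustrative ROUGE-1: clipped unigram overlap between two texts.

    Assumption: tokens are lowercase whitespace-separated words, which is
    simpler than the tokenization used by standard ROUGE toolkits.
    """
    cand_counts = Counter(candidate.lower().split())
    ref_counts = Counter(reference.lower().split())

    # Counter intersection keeps the minimum count of each shared token,
    # which is the "clipped" overlap that ROUGE-N is defined on.
    overlap = sum((cand_counts & ref_counts).values())

    cand_total = sum(cand_counts.values())
    ref_total = sum(ref_counts.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Hypothetical generated summary and reference summary.
    scores = rouge_1("the room was clean", "the room was clean and quiet")
    print(scores)
```

In this sketch, recall measures how much of the reference summary is covered by the generated one, while precision penalizes generated words that do not appear in the reference.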
