Paper Title

Quality Controlled Paraphrase Generation

Authors

Elron Bandel, Ranit Aharonov, Michal Shmueli-Scheuer, Ilya Shnayderman, Noam Slonim, Liat Ein-Dor

Abstract

Paraphrase generation has been widely used in various downstream tasks. Most tasks benefit mainly from high-quality paraphrases, namely those that are semantically similar to, yet linguistically diverse from, the original sentence. Generating high-quality paraphrases is challenging because it becomes increasingly hard to preserve meaning as linguistic diversity increases. Recent works achieve good results by controlling specific aspects of the paraphrase, such as its syntactic tree. However, they do not allow direct control over the quality of the generated paraphrase, and they suffer from low flexibility and scalability. Here we propose $QCPG$, a quality-guided controlled paraphrase generation model that allows direct control of the quality dimensions. Furthermore, we suggest a method that, given a sentence, identifies points in the quality control space that are expected to yield optimal generated paraphrases. We show that our method is able to generate paraphrases that maintain the original meaning while achieving higher diversity than the uncontrolled baseline. The models, the code, and the data can be found at https://github.com/IBM/quality-controlled-paraphrase-generation.
