Paper Title
Assessing Discourse Relations in Language Generation from GPT-2
Paper Authors
Paper Abstract
Recent advances in NLP have been attributed to the emergence of large-scale pre-trained language models. GPT-2, in particular, is suited for generation tasks given its left-to-right language modeling objective, yet the linguistic quality of its generated text has largely remained unexplored. Our work takes a step toward understanding GPT-2's outputs in terms of discourse coherence. We perform a comprehensive study of the validity of explicit discourse relations in GPT-2's outputs under both organic-generation and fine-tuned scenarios. Results show that GPT-2 does not always generate text containing valid discourse relations; nevertheless, its text is more aligned with human expectations in the fine-tuned scenario. We propose a decoupled strategy to mitigate these problems and highlight the importance of explicitly modeling discourse information.
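To make the organic-generation scenario concrete, below is a minimal sketch, not the authors' actual pipeline, of sampling open-ended continuations from off-the-shelf GPT-2 with the Hugging Face transformers library and naively scanning the output for explicit discourse connectives. The prompt text and the connective list are illustrative assumptions; the paper's validity analysis of discourse relations is far more careful than this surface check.

    # Minimal sketch of "organic generation" with off-the-shelf GPT-2.
    # Assumptions: an illustrative prompt ending in a connective, and a
    # toy list of explicit discourse connectives.
    from transformers import GPT2LMHeadModel, GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2")

    prompt = "The experiment failed because"  # hypothetical prompt
    inputs = tokenizer(prompt, return_tensors="pt")

    # Left-to-right sampling: GPT-2's language modeling objective makes
    # this the natural, unconstrained generation mode.
    outputs = model.generate(
        **inputs,
        max_new_tokens=40,
        do_sample=True,
        top_p=0.9,
        pad_token_id=tokenizer.eos_token_id,
    )
    text = tokenizer.decode(outputs[0], skip_special_tokens=True)

    # Naive surface scan for explicit discourse connectives; a stand-in
    # for the paper's discourse-relation validity analysis.
    CONNECTIVES = {"because", "however", "therefore", "although"}
    found = [w for w in CONNECTIVES if f" {w} " in f" {text.lower()} "]
    print(text)
    print("explicit connectives found:", found)

Whether a generated continuation actually realizes the discourse relation that a connective like "because" signals is exactly the validity question the paper studies; detecting the connective is the easy part.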