Paper Title
Trading Off Diversity and Quality in Natural Language Generation
Authors
Abstract
For open-ended language generation tasks such as storytelling and dialogue, choosing the right decoding algorithm is critical to controlling the tradeoff between generation quality and diversity. However, there presently exists no consensus on which decoding procedure is best or even the criteria by which to compare them. We address these issues by casting decoding as a multi-objective optimization problem aiming to simultaneously maximize both response quality and diversity. Our framework enables us to perform the first large-scale evaluation of decoding methods along the entire quality-diversity spectrum. We find that when diversity is a priority, all methods perform similarly, but when quality is viewed as more important, the recently proposed nucleus sampling (Holtzman et al., 2019) outperforms all other evaluated decoding algorithms. Our experiments also confirm the existence of the "likelihood trap", the counter-intuitive observation that high likelihood sequences are often surprisingly low quality. We leverage our findings to create and evaluate an algorithm called selective sampling, which tractably approximates globally-normalized temperature sampling.
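To make the decoding knobs named in the abstract concrete, here is a minimal sketch of temperature sampling and nucleus (top-p) sampling applied to a single next-token distribution. It is an illustration under stated assumptions, not the paper's implementation: the function names (`softmax`, `nucleus_filter`, `sample_token`) and the use of NumPy are choices made for this example, and selective sampling is not shown because the abstract does not specify it.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Turn logits into probabilities; temperature < 1 sharpens, > 1 flattens."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max()  # subtract the max for numerical stability
    p = np.exp(z)
    return p / p.sum()


def nucleus_filter(probs, top_p):
    """Keep the smallest set of most-probable tokens whose total mass
    reaches top_p, zero out the rest, and renormalize."""
    probs = np.asarray(probs, dtype=float)
    order = np.argsort(-probs)              # token indices, most probable first
    cumulative = np.cumsum(probs[order])
    # First position where the cumulative mass is >= top_p (inclusive).
    cutoff = int(np.searchsorted(cumulative, top_p)) + 1
    keep = order[:cutoff]
    filtered = np.zeros_like(probs)
    filtered[keep] = probs[keep]
    return filtered / filtered.sum()


def sample_token(logits, temperature=1.0, top_p=1.0, rng=None):
    """Hypothetical helper: sample one token index from per-step logits."""
    rng = rng or np.random.default_rng()
    probs = nucleus_filter(softmax(logits, temperature), top_p)
    return int(rng.choice(len(probs), p=probs))
```

Lowering `temperature` or `top_p` concentrates probability on the most likely tokens, which tends to favor quality over diversity; raising them does the opposite. Note that this sketch normalizes locally, one token at a time, whereas the abstract's globally-normalized temperature sampling refers to applying temperature to whole-sequence probabilities.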