Paper Title
IFDID: Information Filter upon Diversity-Improved Decoding for Diversity-Faithfulness Tradeoff in NLG
Authors
Abstract
Some Natural Language Generation (NLG) tasks require both faithfulness and diversity. The decoding strategy is closely related to the quality of the generated text. Strategies such as beam search and greedy search yield low diversity and high repetition. On the other hand, guided decoding, a common remedy for low diversity, may generate unfaithful expressions. To this end, this paper presents Information Filter upon Diversity-Improved Decoding (IFDID) to obtain a tradeoff between diversity and faithfulness. IFDID is a two-stage decoding strategy built on the proposed Enhance-Filter framework: it first increases the probabilities of some typical tokens being selected, and then filters those tokens by their information amount. To verify its effectiveness, we compare our method with other baselines on the CommonGEN, RocStories and AdGen benchmarks, which cover Chinese and English datasets. Our numerical experimental results and human evaluation outcomes verify the effectiveness of the proposed approach: it achieves a ROUGE score 1.24 higher (measuring faithfulness) and a Dist-2 score 62.5% higher (measuring diversity) than traditional approaches, demonstrating that IFDID is a novel SOTA decoding strategy for the tradeoff between diversity and faithfulness.
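For readers unfamiliar with the diversity metric named in the abstract, the following is a minimal sketch of Dist-2 as it is commonly defined: the number of unique bigrams divided by the total number of bigrams in the generated texts. The function name `distinct_n` and the corpus-level aggregation are assumptions made for illustration; the abstract does not state which variant of the metric (corpus-level or per-sentence average) or which tokenization the authors use.

```python
from typing import List


def distinct_n(texts: List[List[str]], n: int = 2) -> float:
    """Corpus-level Dist-n: unique n-grams / total n-grams.

    `texts` is a list of already-tokenized generations.
    This is a common formulation, assumed here for illustration only.
    """
    total = 0
    unique = set()
    for tokens in texts:
        # Slide a window of size n over the token sequence.
        for i in range(len(tokens) - n + 1):
            unique.add(tuple(tokens[i:i + n]))
            total += 1
    # Return 0.0 when no n-gram exists, to avoid division by zero.
    return len(unique) / total if total else 0.0


if __name__ == "__main__":
    repetitive = [["the", "cat", "the", "cat", "the", "cat"]]
    varied = [["the", "cat", "sat", "on", "a", "mat"]]
    print(distinct_n(repetitive))  # lower value: bigrams repeat
    print(distinct_n(varied))      # higher value: every bigram is unique
```

Under this definition, a higher Dist-2 means fewer repeated bigrams, which is why the abstract reports it as a measure of diversity.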