Paper title
An Analysis of the Effects of Decoding Algorithms on Fairness in Open-Ended Language Generation
Paper authors
Paper abstract
Several prior works have shown that language models (LMs) can generate text containing harmful social biases and stereotypes. While decoding algorithms play a central role in determining the properties of LM-generated text, their impact on the fairness of the generations has not been studied. We present a systematic analysis of the impact of decoding algorithms on LM fairness, and analyze the trade-offs between fairness, diversity, and quality. Our experiments with top-$p$, top-$k$, and temperature decoding algorithms in open-ended language generation show that fairness across demographic groups changes significantly as the decoding algorithms' hyperparameters are varied. Notably, decoding algorithms that output more diverse text also output more text with negative sentiment and regard. We present several findings and provide recommendations on standardized reporting of decoding details in fairness evaluations, and on optimizing decoding algorithms for fairness alongside quality and diversity.
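For readers unfamiliar with the three decoding algorithms the abstract names, the following is a minimal stdlib-only Python sketch of how their hyperparameters reshape the next-token distribution. The function names and the toy logits are illustrative assumptions, not part of the paper; the paper studies how sweeping these hyperparameters affects fairness metrics.

```python
import math

def softmax(logits):
    # Convert raw logits to a probability distribution.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def apply_temperature(logits, temperature):
    # Temperature scales logits before softmax: T < 1 sharpens the
    # distribution (less diverse), T > 1 flattens it (more diverse).
    return [x / temperature for x in logits]

def top_k_filter(probs, k):
    # Top-k: keep only the k most probable tokens, renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep = set(ranked[:k])
    filtered = [p if i in keep else 0.0 for i, p in enumerate(probs)]
    total = sum(filtered)
    return [p / total for p in filtered]

def top_p_filter(probs, p):
    # Top-p (nucleus): keep the smallest set of tokens whose cumulative
    # probability reaches p, renormalize.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    keep, cum = set(), 0.0
    for i in ranked:
        keep.add(i)
        cum += probs[i]
        if cum >= p:
            break
    filtered = [q if i in keep else 0.0 for i, q in enumerate(probs)]
    total = sum(filtered)
    return [q / total for q in filtered]

# Toy next-token logits over a 4-token vocabulary (illustrative only).
logits = [2.0, 1.0, 0.1, -1.0]
probs = softmax(logits)
```

Larger $k$, larger $p$, or higher temperature all admit more low-probability tokens, which is why the abstract frames the results as a diversity trade-off: the same hyperparameter settings that increase diversity also increase the rate of negative-sentiment and negative-regard generations.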