通过开放式语言生成来衡量偏见的挑战

论文标题

通过开放式语言生成来衡量偏见的挑战

Challenges in Measuring Bias via Open-Ended Language Generation

论文作者

Akyürek, Afra Feyza, Kocyigit, Muhammed Yusuf, Paik, Sejin, Wijaya, Derry

论文摘要

研究人员已经设计了许多方法来量化鉴于经过验证的语言模型的社会偏见。由于某些语言模型能够在一组文本提示下产生连贯的完成，因此已经提出了一些提示数据集来衡量社交群体之间的偏见 - 使语言生成作为识别偏见的一种方式。在本意见论文中，我们分析了及时集，指标，自动工具和抽样策略的具体选择如何影响偏见结果。我们发现，通过文本完成来测量偏见的实践很容易在不同的实验设置下产生矛盾的结果。我们还提供建议在开放式语言生成中报告偏见，以使给定语言模型表现出更完整的偏见前景。复制结果的代码通过https://github.com/feyzaakyurek/bias-textgen发布。

Researchers have devised numerous ways to quantify social biases vested in pretrained language models. As some language models are capable of generating coherent completions given a set of textual prompts, several prompting datasets have been proposed to measure biases between social groups -- posing language generation as a way of identifying biases. In this opinion paper, we analyze how specific choices of prompt sets, metrics, automatic tools and sampling strategies affect bias results. We find out that the practice of measuring biases through text completion is prone to yielding contradicting results under different experiment settings. We additionally provide recommendations for reporting biases in open-ended language generation for a more complete outlook of biases exhibited by a given language model. Code to reproduce the results is released under https://github.com/feyzaakyurek/bias-textgen.

下载PDF全文

下载文献需遵守相关版权规定

论文标题