Paper title
[Re] Badder Seeds: Reproducing the Evaluation of Lexical Methods for Bias Measurement
Paper authors
Paper abstract
Combating bias in NLP requires bias measurement. Bias measurement is almost always achieved by using lexicons of seed terms, i.e. sets of words specifying stereotypes or dimensions of interest. This reproducibility study focuses on the original authors' main claim that the rationale for the construction of these lexicons needs thorough checking before use, as the seeds used for bias measurement can themselves exhibit biases. The study aims to evaluate the reproducibility of the quantitative and qualitative results presented in the paper and of the conclusions drawn from them. We reproduce most of the results supporting the original authors' general claim: seed sets often suffer from biases that affect their performance as a baseline for bias metrics. Generally, our results mirror those of the original paper. They differ slightly on select occasions, but not in ways that undermine the paper's general intent to show the fragility of seed sets.