Paper Title

Counterfactual Multi-Token Fairness in Text Classification

Author

Lohia, Pranay

Abstract

The counterfactual token generation has been limited to perturbing only a single token in texts that are generally short and single sentences. These tokens are often associated with one of many sensitive attributes. With limited counterfactuals generated, the goal to achieve invariant nature for machine learning classification models towards any sensitive attribute gets bounded, and the formulation of Counterfactual Fairness gets narrowed. In this paper, we overcome these limitations by solving root problems and opening bigger domains for understanding. We have curated a resource of sensitive tokens and their corresponding perturbation tokens, even extending the support beyond traditionally used sensitive attributes like Age, Gender, Race to Nationality, Disability, and Religion. The concept of Counterfactual Generation has been extended to multi-token support valid over all forms of texts and documents. We define the method of generating counterfactuals by perturbing multiple sensitive tokens as Counterfactual Multi-token Generation. The method has been conceptualized to showcase significant performance improvement over single-token methods and validated over multiple benchmark datasets. The emendation in counterfactual generation propagates in achieving improved Counterfactual Multi-token Fairness.
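
The idea of perturbing several sensitive tokens at once can be illustrated with a minimal sketch. It is not the paper's implementation: the lexicon `PERTURBATIONS`, the function name `multi_token_counterfactuals`, and the regex-based tokenization are all assumptions made for illustration, and the paper's curated resource of sensitive tokens is not reproduced here.

```python
import re
from itertools import product

# Hypothetical toy lexicon: sensitive token -> perturbation tokens.
# The paper's curated resource is far larger; these entries are assumptions.
PERTURBATIONS = {
    "he": ["she"],
    "she": ["he"],
    "man": ["woman"],
    "woman": ["man"],
    "christian": ["muslim", "hindu"],
}


def multi_token_counterfactuals(text, lexicon=PERTURBATIONS):
    """Return counterfactual texts with every sensitive token perturbed.

    Each result replaces all sensitive tokens found in the text at once,
    taking one perturbation per token from the lexicon.
    """
    # Split into alternating word and non-word chunks so the original
    # spacing and punctuation can be reassembled unchanged.
    tokens = re.findall(r"\w+|\W+", text)
    slots = [i for i, tok in enumerate(tokens) if tok.lower() in lexicon]
    if not slots:
        return []

    choices = [lexicon[tokens[i].lower()] for i in slots]
    results = []
    for combo in product(*choices):
        perturbed = list(tokens)
        for i, replacement in zip(slots, combo):
            # Keep the capitalization of the token being replaced.
            if tokens[i][0].isupper():
                replacement = replacement.capitalize()
            perturbed[i] = replacement
        results.append("".join(perturbed))
    return results


if __name__ == "__main__":
    for sentence in multi_token_counterfactuals("He is a Christian man."):
        print(sentence)
```

A single-token approach would change only one of the three sensitive words in the example sentence. This sketch changes all of them together, so a classifier's predictions on the original and perturbed texts can be compared for invariance across several attributes at once.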
