Paper Title
Generate Your Counterfactuals: Towards Controlled Counterfactual Generation for Text
Paper Authors
Paper Abstract
Machine Learning has seen tremendous growth recently, which has led to wider adoption of ML systems in educational assessment, credit risk, healthcare, employment, and criminal justice, to name a few. The trustworthiness of ML and NLP systems is a crucial concern and requires guarantees that the decisions they make are fair and robust. Aligned with this, we propose GYC, a framework to generate a set of counterfactual text samples, which are crucial for testing these ML systems. Our main contributions include: a) we introduce GYC, a framework to generate counterfactual samples such that the generation is plausible, diverse, goal-oriented, and effective; b) we generate counterfactual samples that can direct the generation towards a corresponding condition, such as a named-entity tag, semantic role label, or sentiment. Our experimental results on various domains show that GYC generates counterfactual text samples exhibiting the above four properties. The counterfactuals generated by GYC can act as test cases to evaluate a model and any text debiasing algorithm.