Paper Title

Discriminatory Expressions to Produce Interpretable Models in Short Documents

Paper Authors

Manuel Francisco, Juan Luis Castro

Paper Abstract

Social Networking Sites (SNS) are one of the most important means of communication. In particular, microblogging sites are used as analysis avenues due to their peculiarities (promptness, short texts...). There are countless studies that use SNS in novel ways, but machine learning has focused mainly on classification performance rather than interpretability and/or other goodness metrics. Thus, state-of-the-art models are black boxes that should not be used to solve problems that may have a social impact. When the problem requires transparency, it is necessary to build interpretable pipelines. Even if the classifier itself is interpretable, the resulting models are often too complex to be considered comprehensible, making it impossible for humans to understand the actual decisions. This paper presents a feature selection mechanism that improves comprehensibility by using fewer but more meaningful features, while achieving good performance in microblogging contexts where interpretability is mandatory. Moreover, we present a ranking method to evaluate features in terms of statistical relevance and bias. We conducted exhaustive tests with five different datasets in order to evaluate classification performance, generalisation capacity and complexity of the model. Results show that our proposal is better and the most stable one in terms of accuracy, generalisation and comprehensibility.
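
To give a rough feel for the kind of pipeline the abstract describes, the sketch below ranks text features by statistical relevance and keeps only the top ones. It uses a generic chi-squared filter as a stand-in, not the authors' actual discriminatory-expression selection or bias measure; all data and names are hypothetical.

```python
# Minimal sketch: rank bag-of-words features by statistical relevance (chi-squared)
# and keep a small, more meaningful vocabulary for an interpretable classifier.
# This is a generic illustration, not the paper's method; the data is hypothetical.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.feature_selection import chi2

# Hypothetical short documents (microblog posts) with binary labels.
docs = [
    "great match today, what a win",
    "terrible service, never coming back",
    "loved the concert, amazing night",
    "worst experience ever, so disappointed",
]
labels = [1, 0, 1, 0]

# Unigram/bigram bag-of-words representation of the short documents.
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)

# Chi-squared statistic of each feature with respect to the class label.
scores, _ = chi2(X, labels)

# Keep the k highest-scoring features: a smaller feature set that a transparent
# model (e.g. a shallow decision tree) can use without becoming incomprehensible.
k = 10
top = np.argsort(scores)[::-1][:k]
print(vectorizer.get_feature_names_out()[top])
```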
