Paper Title

Don't Explain without Verifying Veracity: An Evaluation of Explainable AI with Video Activity Recognition

Paper Authors

Mahsan Nourani, Chiradeep Roy, Tahrima Rahman, Eric D. Ragan, Nicholas Ruozzi, Vibhav Gogate

Paper Abstract

Explainable machine learning and artificial intelligence models have been used to justify a model's decision-making process. This added transparency aims to help improve user performance and understanding of the underlying model. However, in practice, explainable systems face many open questions and challenges. Specifically, designers might reduce the complexity of deep learning models in order to provide interpretability. The explanations generated by these simplified models, however, might not accurately justify and be truthful to the model. This can further add confusion to the users as they might not find the explanations meaningful with respect to the model predictions. Understanding how these explanations affect user behavior is an ongoing challenge. In this paper, we explore how explanation veracity affects user performance and agreement in intelligent systems. Through a controlled user study with an explainable activity recognition system, we compare variations in explanation veracity for a video review and querying task. The results suggest that low veracity explanations significantly decrease user performance and agreement compared to both accurate explanations and a system without explanations. These findings demonstrate the importance of accurate and understandable explanations and caution that poor explanations can sometimes be worse than no explanations with respect to their effect on user performance and reliance on an AI system.
