Paper Title
TCNL: Transparent and Controllable Network Learning Via Embedding Human-Guided Concepts
Paper Authors
Paper Abstract
Explaining deep learning models is of vital importance for understanding artificial intelligence systems, improving safety, and evaluating fairness. To better understand and control the CNN model, many methods for transparency-interpretability have been proposed. However, most of these works are not very intuitive for humans to understand and provide insufficient human control over the CNN model. We propose a novel method, Transparent and Controllable Network Learning (TCNL), to overcome such challenges. Towards the goal of improving transparency-interpretability, in TCNL, we define some concepts for specific classification tasks through a scientific human-intuition study and incorporate the concept information into the CNN model. In TCNL, a shallow feature extractor first extracts preliminary features. Then several concept feature extractors are built right after the shallow feature extractor to learn high-dimensional concept representations. Each concept feature extractor is encouraged to encode information related to its predefined concept. We also build a concept mapper to visualize the features extracted by the concept extractors in a human-intuitive way. TCNL provides a generalizable approach to transparency-interpretability. Researchers can define concepts corresponding to certain classification tasks and encourage the model to encode specific concept information, which improves, to a certain extent, the transparency-interpretability and controllability of the CNN model. The datasets (with concept sets) for our experiments will also be released (https://github.com/bupt-ai-cz/TCNL).
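The abstract describes a pipeline of a shared shallow feature extractor, several per-concept feature extractors, a concept mapper for visualization, and a classifier on top. Below is a minimal PyTorch-style sketch of that structure, assuming simple convolutional blocks; all module names, layer sizes, and the example concept list are illustrative assumptions, not the authors' released implementation.

```python
# Minimal architectural sketch of the TCNL idea described in the abstract.
# All module names, channel sizes, and layer choices are illustrative
# assumptions -- they are NOT taken from the authors' code.
import torch
import torch.nn as nn


class ShallowExtractor(nn.Module):
    """Shared shallow feature extractor producing preliminary features."""
    def __init__(self, in_ch=3, out_ch=64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        return self.body(x)


class ConceptExtractor(nn.Module):
    """One branch per predefined concept; learns a concept representation."""
    def __init__(self, in_ch=64, out_ch=128):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, 3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(4),
        )

    def forward(self, feats):
        return self.body(feats)


class ConceptMapper(nn.Module):
    """Decodes a concept representation back to image space for visualization."""
    def __init__(self, in_ch=128, out_ch=3):
        super().__init__()
        self.body = nn.Sequential(
            nn.Upsample(scale_factor=8, mode="bilinear", align_corners=False),
            nn.Conv2d(in_ch, out_ch, 3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, z):
        return self.body(z)


class TCNL(nn.Module):
    """Shallow extractor -> per-concept extractors (+ mappers) -> classifier."""
    def __init__(self, concepts, num_classes):
        super().__init__()
        self.shallow = ShallowExtractor()
        self.extractors = nn.ModuleDict({c: ConceptExtractor() for c in concepts})
        self.mappers = nn.ModuleDict({c: ConceptMapper() for c in concepts})
        self.classifier = nn.Linear(len(concepts) * 128 * 4 * 4, num_classes)

    def forward(self, x):
        feats = self.shallow(x)                                   # preliminary features
        reps = {c: ext(feats) for c, ext in self.extractors.items()}   # concept representations
        vis = {c: self.mappers[c](z) for c, z in reps.items()}         # human-intuitive visualizations
        joint = torch.cat([z.flatten(1) for z in reps.values()], dim=1)
        return self.classifier(joint), reps, vis


# Example with a hypothetical concept set for a vehicle classification task.
model = TCNL(concepts=["wheel", "window", "body"], num_classes=10)
logits, reps, vis = model(torch.randn(2, 3, 224, 224))
```

In the abstract's framing, each concept branch would additionally be supervised with an auxiliary objective that encourages it to encode its predefined concept; the concrete loss design and concept-set construction are specified in the paper, not in this sketch.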