Paper Title
Building and Interpreting Deep Similarity Models
Paper Authors
Paper Abstract
Many learning algorithms such as kernel machines, nearest neighbors, clustering, or anomaly detection are based on the concept of 'distance' or 'similarity'. Before similarities are used for training an actual machine learning model, we would like to verify that they are bound to meaningful patterns in the data. In this paper, we propose to make similarities interpretable by augmenting them with an explanation in terms of input features. We develop BiLRP, a scalable and theoretically founded method to systematically decompose similarity scores on pairs of input features. Our method can be expressed as a composition of LRP explanations, which have been shown in previous work to scale to highly nonlinear functions. Through an extensive set of experiments, we demonstrate that BiLRP robustly explains complex similarity models, e.g., built on VGG-16 deep neural network features. Additionally, we apply our method to an open problem in digital humanities: detailed assessment of similarity between historical documents such as astronomical tables. Here again, BiLRP provides insight and brings verifiability into a highly engineered and problem-specific similarity model.
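The composition of LRP explanations mentioned in the abstract can be illustrated in a minimal sketch. Assume (hypothetically, for illustration only) a linear feature map phi(x) = W @ x, for which LRP reduces to a simple product of weight and input; the similarity is the dot product of features, and BiLRP contracts the two per-feature LRP maps into a pairwise relevance matrix R[i, j]. The names `W`, `x`, `xp` and the linear setup are assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(0)
d, h = 5, 3                       # input and feature dimensions (arbitrary)
W = rng.normal(size=(h, d))       # hypothetical linear feature map: phi(x) = W @ x
x = rng.normal(size=d)            # first input
xp = rng.normal(size=d)           # second input

# Similarity as a dot product of features: y = <phi(x), phi(x')>
y = (W @ x) @ (W @ xp)

# LRP for a linear map: relevance of input i for feature output k is W[k, i] * x[i]
R_x = W * x                       # shape (h, d): one LRP map per feature phi_k of x
R_xp = W * xp                     # shape (h, d): one LRP map per feature phi_k of x'

# BiLRP as a composition of LRP explanations, summed over the feature index k:
#   R[i, j] = sum_k LRP_i(phi_k, x) * LRP_j(phi_k, x')
R = np.einsum('ki,kj->ij', R_x, R_xp)

# Conservation: the pairwise relevances sum back to the similarity score
assert np.isclose(R.sum(), y)
```

In the linear case the decomposition is exact by construction; for deep nonlinear feature extractors such as VGG-16, each per-feature map would instead be computed by propagating relevance through the network with LRP rules, and the same pairwise composition applies.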