Paper title
AMR Similarity Metrics from Principles
Paper authors
Paper abstract
Different metrics have been proposed to compare Abstract Meaning Representation (AMR) graphs. The canonical Smatch metric (Cai and Knight, 2013) aligns the variables of two graphs and assesses triple matches. The recent SemBleu metric (Song and Gildea, 2019) is based on the machine-translation metric BLEU (Papineni et al., 2002) and increases computational efficiency by dropping the variable alignment. In this paper, i) we establish criteria that enable researchers to perform a principled assessment of metrics that compare meaning representations such as AMR; ii) we undertake a thorough analysis of Smatch and SemBleu, in which we show that the latter exhibits some undesirable properties: for example, it does not conform to the identity of indiscernibles rule, and it introduces biases that are hard to control; iii) we propose a novel metric, S$^2$match, that is more lenient toward very slight deviations in meaning and aims to fulfil all of the established criteria. We assess its suitability and show its advantages over Smatch and SemBleu.
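To make the phrase "aligns the variables of two graphs and assesses triple matches" concrete, the following is a minimal sketch of that idea, not the Smatch implementation itself. It assumes graphs are given as lists of (source, relation, target) triples and, for simplicity, searches all variable mappings by brute force, whereas Smatch is generally described as using a heuristic search. The function name `best_triple_f1` and the toy graphs are hypothetical and serve only as illustration.

```python
from itertools import permutations


def best_triple_f1(triples_a, vars_a, triples_b, vars_b):
    """Return the best triple-overlap F1 over all injective variable mappings.

    Illustrative brute-force search; only feasible for very small graphs.
    Assumes variable names never coincide with concept names or constants.
    """
    # F1 is symmetric, so always map the smaller variable set into the larger.
    if len(vars_a) > len(vars_b):
        triples_a, vars_a, triples_b, vars_b = triples_b, vars_b, triples_a, vars_a

    set_b = set(triples_b)
    best = 0.0
    for perm in permutations(vars_b, len(vars_a)):
        mapping = dict(zip(vars_a, perm))
        # Rename the variables of graph A; concepts and constants stay as they are.
        renamed = {(mapping.get(s, s), r, mapping.get(t, t)) for s, r, t in triples_a}
        matches = len(renamed & set_b)
        if matches:
            precision = matches / len(renamed)
            recall = matches / len(set_b)
            best = max(best, 2 * precision * recall / (precision + recall))
    return best


# Toy graph A: roughly "the boy wants (something)"
graph_a = [("x1", "instance", "want-01"), ("x2", "instance", "boy"), ("x1", "ARG0", "x2")]
# Toy graph B: roughly "the boy wants to go"
graph_b = [
    ("y1", "instance", "want-01"), ("y2", "instance", "boy"), ("y3", "instance", "go-01"),
    ("y1", "ARG0", "y2"), ("y1", "ARG1", "y3"), ("y3", "ARG0", "y2"),
]

score_ab = best_triple_f1(graph_a, ["x1", "x2"], graph_b, ["y1", "y2", "y3"])
score_aa = best_triple_f1(graph_a, ["x1", "x2"], graph_a, ["x1", "x2"])
```

Under these assumptions, all three triples of graph A are matched against the six triples of graph B, so `score_ab` is about 0.67 (precision 1.0, recall 0.5). Comparing graph A with itself gives 1.0, which is the half of the identity of indiscernibles rule that this sketch can illustrate; the rule additionally requires that only equivalent graphs obtain the maximum score.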