Paper Title
Graph-Aware Transformer: Is Attention All Graphs Need?
Paper Authors
Paper Abstract
Graphs are the natural data structure for representing relational and structural information in many domains. To cover the broad range of graph-data applications, including graph classification as well as graph generation, it is desirable to have a general and flexible model, consisting of an encoder and a decoder, that can handle graph data. Although the representative encoder-decoder model, the Transformer, shows superior performance on various tasks, especially in natural language processing, it is not immediately applicable to graphs due to their non-sequential nature. To tackle this incompatibility, we propose the GRaph-Aware Transformer (GRAT), the first Transformer-based model that can encode and decode whole graphs in an end-to-end fashion. GRAT features a self-attention mechanism that adapts to edge information, and an auto-regressive decoding mechanism based on a two-path approach, consisting of a sub-graph encoding path and a node-and-edge generation path at each decoding step. We empirically evaluated GRAT on multiple setups, including encoder-based tasks such as molecular property prediction on the QM9 dataset, and encoder-decoder-based tasks such as molecular graph generation in the organic molecule synthesis domain. GRAT shows very promising results, including state-of-the-art performance on 4 regression tasks of the QM9 benchmark.
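The abstract's central encoder idea is self-attention that adapts to edge information. Below is a minimal sketch of one plausible form of such a mechanism in PyTorch, in which pairwise edge features are projected to additive biases on the attention logits. The class name `EdgeAwareSelfAttention`, the bias formulation, and all shapes are illustrative assumptions for exposition, not the actual GRAT formulation, which the abstract does not specify.

```python
# Sketch: edge-adaptive self-attention over a dense graph (assumed formulation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeAwareSelfAttention(nn.Module):
    def __init__(self, node_dim: int, edge_dim: int):
        super().__init__()
        self.q = nn.Linear(node_dim, node_dim)
        self.k = nn.Linear(node_dim, node_dim)
        self.v = nn.Linear(node_dim, node_dim)
        # Project each pairwise edge feature to a scalar attention bias (assumption).
        self.edge_bias = nn.Linear(edge_dim, 1)
        self.scale = node_dim ** -0.5

    def forward(self, nodes: torch.Tensor, edges: torch.Tensor) -> torch.Tensor:
        # nodes: (N, node_dim); edges: (N, N, edge_dim), zero vectors where no edge.
        q, k, v = self.q(nodes), self.k(nodes), self.v(nodes)
        logits = (q @ k.T) * self.scale                      # (N, N) node-pair scores
        logits = logits + self.edge_bias(edges).squeeze(-1)  # inject edge information
        attn = F.softmax(logits, dim=-1)
        return attn @ v                                      # (N, node_dim)

# Usage on a toy 4-node graph with 16-dim node and 4-dim edge features:
layer = EdgeAwareSelfAttention(node_dim=16, edge_dim=4)
x = torch.randn(4, 16)
e = torch.randn(4, 4, 4)
out = layer(x, e)  # -> shape (4, 16)
```

Adding the edge projection as a bias inside the softmax is one common way to make attention weights sensitive to bond or relation types; other realizations (e.g., gating the values by edge features) would be equally consistent with the abstract's description.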