Walk-and-Relate: A Random-Walk-based Algorithm for Representation Learning on Sparse Knowledge Graphs

Paper Title

Walk-and-Relate: A Random-Walk-based Algorithm for Representation Learning on Sparse Knowledge Graphs

Author

Manchanda, Saurav

Abstract

Knowledge graph (KG) embedding techniques use structured relationships between entities to learn low-dimensional representations of entities and relations. Traditional KG embedding techniques (such as TransE and DistMult) estimate these embeddings via simple models developed over observed KG triplets. These approaches differ in their triplet scoring loss functions. Because these models use only the observed triplets to estimate the embeddings, they are prone to suffer from the data sparsity that usually occurs in real-world knowledge graphs, i.e., the lack of enough triplets per entity. To address this issue, we propose an efficient method to augment the number of triplets. We use random walks to create additional triplets, such that the relations carried by these introduced triplets entail the metapath induced by the random walks. We also provide approaches to accurately and efficiently select informative metapaths from the set of possible metapaths induced by the random walks. The proposed approaches are model-agnostic, and the augmented training dataset can be used with any KG embedding approach out of the box. Experimental results obtained on benchmark datasets show the advantages of the proposed approach.
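
The augmentation idea in the abstract can be illustrated with a small sketch. This is not the paper's implementation: the toy triplets, the function names (build_adjacency, random_walk, augment), the ">" separator used to name a metapath relation, and the min_count frequency filter are all assumptions made for illustration. In particular, the abstract does not say how informative metapaths are chosen, so a simple count threshold stands in for that step. Walks here follow edges in the head-to-tail direction only.

```python
import random
from collections import Counter, defaultdict

# Toy knowledge graph (hypothetical data): (head, relation, tail) triplets.
TRIPLETS = [
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "located_in", "berlin"),
    ("carol", "works_at", "globex"),
    ("globex", "located_in", "paris"),
]


def build_adjacency(triplets):
    """Map each head entity to its outgoing (relation, tail) edges."""
    adjacency = defaultdict(list)
    for head, relation, tail in triplets:
        adjacency[head].append((relation, tail))
    return adjacency


def random_walk(adjacency, start, max_hops, rng):
    """Follow up to max_hops random outgoing edges from start.

    Returns the metapath (tuple of relations traversed) and the end entity.
    """
    metapath, current = [], start
    for _ in range(max_hops):
        neighbors = adjacency.get(current)
        if not neighbors:
            break  # dead end: no outgoing edges
        relation, current = rng.choice(neighbors)
        metapath.append(relation)
    return tuple(metapath), current


def augment(triplets, walks_per_entity=20, max_hops=2, min_count=2, seed=0):
    """Create extra triplets whose relation names the walked metapath.

    Assumption: a metapath is kept only if it connects at least min_count
    distinct (head, tail) pairs. This is a placeholder filter, not the
    paper's selection criterion.
    """
    rng = random.Random(seed)
    adjacency = build_adjacency(triplets)
    candidates = set()
    for entity in sorted(adjacency):
        for _ in range(walks_per_entity):
            metapath, end = random_walk(adjacency, entity, max_hops, rng)
            if len(metapath) < 2:
                continue  # one-hop walks only repeat observed triplets
            candidates.add((entity, metapath, end))
    # Count distinct (head, tail) pairs connected by each metapath.
    counts = Counter(metapath for _, metapath, _ in candidates)
    return sorted(
        (head, ">".join(metapath), tail)
        for head, metapath, tail in candidates
        if counts[metapath] >= min_count
    )


if __name__ == "__main__":
    for triplet in augment(TRIPLETS):
        print(triplet)
```

The returned triplets can be appended to the observed ones before training, which matches the model-agnostic use described in the abstract: the embedding model sees the metapath relation as one more relation type.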
