The Lothbrok approach for SPARQL Query Optimization over Decentralized Knowledge Graphs

Paper title

The Lothbrok approach for SPARQL Query Optimization over Decentralized Knowledge Graphs

Authors

Christian Aebeloe, Gabriela Montoya, Katja Hose

Abstract

While the Web of Data in principle offers access to a wide range of interlinked data, the architecture of the Semantic Web today relies mostly on the data providers to maintain access to their data through SPARQL endpoints. Several studies, however, have shown that such endpoints often experience downtime, meaning that the data they maintain becomes inaccessible. While decentralized systems based on Peer-to-Peer (P2P) technology have previously shown to increase the availability of knowledge graphs, even when a large proportion of the nodes fail, processing queries in such a setup can be an expensive task since data necessary to answer a single query might be distributed over multiple nodes. In this paper, we therefore propose an approach to optimizing SPARQL queries over decentralized knowledge graphs, called Lothbrok. While there are potentially many aspects to consider when optimizing such queries, we focus on three aspects: cardinality estimation, locality awareness, and data fragmentation. We empirically show that Lothbrok is able to achieve significantly faster query processing performance compared to the state of the art when processing challenging queries as well as when the network is under high load.
