Paper Title

Accelerating Graph Sampling for Graph Machine Learning using GPUs

Authors

Abhinav Jangda, Sandeep Polisetty, Arjun Guha, Marco Serafini

Abstract

Representation learning algorithms automatically learn the features of data. Several representation learning algorithms for graph data, such as DeepWalk, node2vec, and GraphSAGE, sample the graph to produce mini-batches that are suitable for training a DNN. However, sampling time can be a significant fraction of training time, and existing systems do not efficiently parallelize sampling. Sampling is an embarrassingly parallel problem and may appear to lend itself to GPU acceleration, but the irregularity of graphs makes it hard to use GPU resources effectively. This paper presents NextDoor, a system designed to effectively perform graph sampling on GPUs. NextDoor employs a new approach to graph sampling that we call transit-parallelism, which allows load balancing and caching of edges. NextDoor provides end-users with a high-level abstraction for writing a variety of graph sampling algorithms. We implement several graph sampling applications, and show that NextDoor runs them orders of magnitude faster than existing systems.
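To make the two ideas in the abstract concrete, the sketch below contrasts sample-parallel walk extension (one scattered edge-list read per walk) with grouping walks by their current "transit" vertex so that walks sharing a vertex sample from one shared edge list. This is a minimal CPU illustration of the transit-parallel idea under assumed names (`step_sample_parallel`, `step_transit_parallel`, a toy `graph` dict), not NextDoor's actual API or GPU implementation.

```python
import random
from collections import defaultdict

# Toy adjacency list; real systems operate on large CSR graphs on the GPU.
graph = {
    0: [1, 2],
    1: [0, 2, 3],
    2: [0, 1],
    3: [1],
}

def step_sample_parallel(walks, rng=random):
    """Extend every walk by one step, iterating per sample.
    Each walk reads the edge list of its own frontier vertex, so edge
    accesses are scattered -- the irregular pattern the paper targets."""
    for walk in walks:
        neighbors = graph[walk[-1]]
        if neighbors:
            walk.append(rng.choice(neighbors))
    return walks

def step_transit_parallel(walks, rng=random):
    """Extend every walk by one step, first grouping walks by their
    current "transit" vertex. All walks in a group sample from the same
    edge list, so that list can be loaded once (cached) and work can be
    balanced across groups -- a sketch of transit-parallelism only."""
    by_transit = defaultdict(list)
    for walk in walks:
        by_transit[walk[-1]].append(walk)
    for transit, group in by_transit.items():
        neighbors = graph[transit]  # one edge-list load per transit vertex
        if not neighbors:
            continue
        for walk in group:
            walk.append(rng.choice(neighbors))
    return walks

# DeepWalk-style fixed-length walks, one rooted at each vertex.
walks = [[v] for v in graph]
for _ in range(3):
    step_transit_parallel(walks)
```

On a GPU, the grouping step is what enables assigning a transit vertex's edge list to fast shared memory and splitting its group of samples across threads, which is how the paper obtains load balancing and edge caching.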
