Title

Kernel methods through the roof: handling billions of points efficiently

Authors

Giacomo Meanti, Luigi Carratino, Lorenzo Rosasco, Alessandro Rudi

Abstract

Kernel methods provide an elegant and principled approach to nonparametric learning, but so far could hardly be used on large-scale problems, since naïve implementations scale poorly with data size. Recent advances have shown the benefits of a number of algorithmic ideas, for example, combining optimization, numerical linear algebra, and random projections. Here, we push these efforts further to develop and test a solver that takes full advantage of GPU hardware. Towards this end, we designed a preconditioned gradient solver for kernel methods exploiting both GPU acceleration and parallelization with multiple GPUs, implementing out-of-core variants of common linear algebra operations to guarantee optimal hardware utilization. Further, we optimize the numerical precision of different operations and maximize the efficiency of matrix-vector multiplications. As a result, we can experimentally show dramatic speedups on datasets with billions of points, while still guaranteeing state-of-the-art performance. Additionally, we make our software available as an easy-to-use library.
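
The library released with the paper is Falkon. To make the approach concrete, below is a minimal, single-machine NumPy/SciPy sketch of the core algorithm the abstract outlines: kernel ridge regression with a Nyström random projection, solved by preconditioned conjugate gradients. The names (gaussian_kernel, nystrom_krr_pcg), the uniform sampling of centers, and the plain float64 arithmetic are illustrative assumptions, not the library's API; the paper's actual solver adds the multi-GPU parallelism, out-of-core memory management, and mixed numerical precision that this sketch omits.

```python
import numpy as np
from scipy.linalg import cho_factor, cho_solve
from scipy.sparse.linalg import cg, LinearOperator

def gaussian_kernel(A, B, sigma=1.0):
    """Gaussian (RBF) kernel matrix between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * sigma**2))

def nystrom_krr_pcg(X, y, num_centers=500, sigma=1.0, lam=1e-6, seed=0):
    """Nystrom kernel ridge regression fitted with preconditioned CG.

    Solves (Knm^T Knm + lam * n * Kmm) alpha = Knm^T y, the normal
    equations of the Nystrom-approximated KRR objective.
    """
    n, m = X.shape[0], num_centers
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(n, size=m, replace=False)]  # uniform sampling
    Knm = gaussian_kernel(X, centers, sigma)
    Kmm = gaussian_kernel(centers, centers, sigma) + 1e-8 * np.eye(m)

    H = Knm.T @ Knm + lam * n * Kmm  # m-by-m system matrix
    b = Knm.T @ y

    # Preconditioner: approximate Knm^T Knm by the surrogate (n/m) Kmm^2,
    # giving P = (n/m) Kmm (Kmm + lam*m*I). Both factors are polynomials
    # in Kmm, so they commute and P is symmetric positive definite;
    # applying P^{-1} costs two triangular (Cholesky) solves.
    c1 = cho_factor(Kmm)
    c2 = cho_factor(Kmm + lam * m * np.eye(m))
    P_inv = LinearOperator(
        (m, m), matvec=lambda v: (m / n) * cho_solve(c2, cho_solve(c1, v))
    )

    alpha, info = cg(H, b, M=P_inv, maxiter=200)
    if info != 0:
        raise RuntimeError("CG did not converge")
    return centers, alpha

# Usage on synthetic data; predictions at new points Xt are
# gaussian_kernel(Xt, centers, sigma) @ alpha.
X = np.random.randn(10_000, 5)
y = np.sin(X.sum(1))
centers, alpha = nystrom_krr_pcg(X, y, num_centers=200, sigma=2.0, lam=1e-5)
y_hat = gaussian_kernel(X, centers, sigma=2.0) @ alpha
print("train MSE:", float(((y_hat - y) ** 2).mean()))
```

The preconditioner is what keeps the iteration count low: it clusters the spectrum of the m-by-m system so CG converges in a handful of steps, leaving the Knm matrix-vector products, precisely the operation the paper distributes across GPUs in out-of-core blocks, as the dominant cost.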
