论文标题
数据遗忘的多学院算法
Data Oblivious Algorithms for Multicores
论文作者
论文摘要
随着安全的处理器(例如Intel SGX(带有超线程))被广泛采用,因此对大数据的私人分析的需求日益增长。大多数先前的关于数据合并算法的工作采用经典的婴儿车模型来捕获并行性。但是,人们广泛理解的是,婴儿车并未最好地捕获现实的多项处理器,也不反映实践中采用的平行编程模型。 在本文中,我们启动了对现实的多孔的平行数据遗漏算法的研究,该算法是由二进制计算模型捕获的。我们首先表明,可以通过具有最佳的总工作和最佳(可缓存)缓存复杂度的二进制叉-Join算法来完成合并数据的排序,并且在O(log n log log log n)跨度(即并行时间)中,与最著名的Insecure Algorithm匹配。使用我们的排序算法作为核心原始性,我们展示了如何以非平凡的效率模拟二进制叉-Join模型中的一般婴儿车算法。我们还提供了几种应用程序的结果,包括列表排名,欧拉游览,树木收缩,连接的组件和最小跨越森林。对于这些应用程序的子集,我们的数据界算法渐变性均优于最著名的不安全算法。对于其他应用程序,我们显示了数据遗漏算法的性能界限与最知名的不安全算法相匹配。 与这些渐进效率的结果相辅相成,我们提出了分类算法的实用变体,该变体是独立的,可以实现的。它具有最佳的缓存成本,它只是最佳工作和大约一个log n因子的对数n因子的关闭。此外,它在其范围内实现了少量的恒定因素。
As secure processors such as Intel SGX (with hyperthreading) become widely adopted, there is a growing appetite for private analytics on big data. Most prior works on data-oblivious algorithms adopt the classical PRAM model to capture parallelism. However, it is widely understood that PRAM does not best capture realistic multicore processors, nor does it reflect parallel programming models adopted in practice. In this paper, we initiate the study of parallel data oblivious algorithms on realistic multicores, best captured by the binary fork-join model of computation. We first show that data-oblivious sorting can be accomplished by a binary fork-join algorithm with optimal total work and optimal (cache-oblivious) cache complexity, and in O(log n log log n) span (i.e., parallel time) that matches the best-known insecure algorithm. Using our sorting algorithm as a core primitive, we show how to data-obliviously simulate general PRAM algorithms in the binary fork-join model with non-trivial efficiency. We also present results for several applications including list ranking, Euler tour, tree contraction, connected components, and minimum spanning forest. For a subset of these applications, our data-oblivious algorithms asymptotically outperform the best known insecure algorithms. For other applications, we show data oblivious algorithms whose performance bounds match the best known insecure algorithms. Complementing these asymptotically efficient results, we present a practical variant of our sorting algorithm that is self-contained and potentially implementable. It has optimal caching cost, and it is only a log log n factor off from optimal work and about a log n factor off in terms of span; moreover, it achieves small constant factors in its bounds.