Paper Title

The Tiny-Tasks Granularity Trade-Off: Balancing overhead vs. performance in parallel systems

Authors

Stefan Bora, Brenton Walker, Markus Fidler

Abstract

Models of parallel processing systems typically assume that one has $l$ workers and that each job is split into an equal number of tasks, $k = l$. Splitting jobs into $k > l$ smaller tasks, i.e., using "tiny tasks", can yield performance and stability improvements because it reduces the variance in the amount of work assigned to each worker, but as $k$ increases, the overhead involved in scheduling and managing the tasks begins to overtake the performance benefit. We perform extensive experiments on the effects of task granularity on an Apache Spark cluster and, based on these, develop a four-parameter model for task and job overhead that, in simulation, produces sojourn time distributions that match those of the real system. We also present analytical results that illustrate how using tiny tasks improves the stability region of split-merge systems, as well as analytical bounds on the sojourn and waiting time distributions of both split-merge and single-queue fork-join systems with tiny tasks. Finally, we combine the overhead model with the analytical models to produce an analytical approximation to the sojourn and waiting time distributions of systems with tiny tasks that include overhead. Though no longer strict analytical bounds, these approximations match the Spark experimental results very well in both the split-merge and fork-join cases.
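The trade-off described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's four-parameter overhead model or its Spark setup; it assumes a single job of unit mean total work, exponentially distributed task sizes with mean $1/k$, a fixed per-task overhead (a hypothetical stand-in for scheduling cost), and $l$ workers that each take the next task as soon as they are idle. The names `job_makespan` and `mean_makespan` are illustrative only.

```python
import heapq
import random


def job_makespan(k, l, overhead, rng):
    """Completion time of one job split into k tasks served by l workers."""
    # Each heap entry is the time at which a worker next becomes idle.
    workers = [0.0] * l
    heapq.heapify(workers)
    for _ in range(k):
        # Assumed task size: exponential with mean 1/k, plus a constant overhead.
        service = rng.expovariate(k) + overhead
        free_at = heapq.heappop(workers)  # earliest idle worker takes the task
        heapq.heappush(workers, free_at + service)
    # The job is done when the last worker finishes its last task.
    return max(workers)


def mean_makespan(k, l=4, overhead=0.0, trials=500, seed=1):
    """Average job completion time over independent simulated jobs."""
    rng = random.Random(seed)
    total = sum(job_makespan(k, l, overhead, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    print("k  no_overhead  with_overhead")
    for k in (4, 16, 64, 256, 1024):
        print(k, round(mean_makespan(k), 3),
              round(mean_makespan(k, overhead=0.01), 3))
```

Under these assumptions, the mean completion time without overhead falls as $k$ grows, because the work is spread more evenly across workers, while with a nonzero per-task overhead it eventually rises again once the accumulated overhead outweighs the balancing gain. The sketch ignores queueing between jobs, so it says nothing about the stability region or waiting times.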
