Paper Title
SAF: Simulated Annealing Fair Scheduling for Hadoop Yarn Clusters
Authors
Abstract
Apache introduced YARN as the next generation of the Hadoop framework, providing resource management and a central platform to deliver consistent data governance tools across Hadoop clusters. Hadoop YARN supports multiple frameworks, such as MapReduce, for processing different types of data, and it works with different scheduling policies such as the FIFO, Capacity, and Fair schedulers. Dominant Resource Fairness (DRF) is the best available option for multi-type resource allocation; it converges to fairness over the short term, without considering historical information. However, DRF's performance is still unsatisfactory because of the trade-off between fairness and performance with respect to resource utilization. To address this problem, we propose Simulated Annealing Fair scheduling (SAF), a long-term fair resource allocation scheme that aims for both fairness and strong performance in terms of resource utilization and makespan. We introduce a new parameter, entropy, which indicates the disorder in the fairness of the resources allocated across the whole cluster. We implemented SAF as a pluggable scheduler in a Hadoop YARN cluster and evaluated it with standard MapReduce benchmarks in the YARN Scheduler Load Simulator (SLS) and the CloudSim Plus simulation framework. The results from both simulation tools support our claim: compared to DRF, SAF significantly increases the resource utilization of YARN clusters and reduces makespan to an appropriate level.
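To make the idea of an entropy-style fairness indicator more concrete, the following is a minimal sketch only. The abstract does not define the entropy parameter, so the formula here is an assumption: it computes each application's DRF dominant share and then uses one minus the normalized Shannon entropy of those shares as a "disorder" score (0 when all dominant shares are equal, approaching 1 when one application holds nearly everything). The function names `dominant_shares` and `fairness_disorder` and the sample cluster are hypothetical and are not taken from the paper or from any Hadoop API.

```python
import math


def dominant_shares(allocations, capacity):
    """Return each app's dominant share: the largest fraction of any
    single resource type that the app currently holds (as in DRF)."""
    return {
        app: max(used[r] / capacity[r] for r in capacity)
        for app, used in allocations.items()
    }


def fairness_disorder(shares):
    """Hypothetical disorder score in [0, 1], NOT the paper's definition.

    Computes 1 - H / log(n), where H is the Shannon entropy of the
    normalized dominant shares and n is the number of apps.
    """
    n = len(shares)
    total = sum(shares.values())
    if n < 2 or total == 0:
        return 0.0  # nothing to compare, treat as perfectly ordered
    probs = [s / total for s in shares.values() if s > 0]
    entropy = -sum(p * math.log(p) for p in probs)
    return 1.0 - entropy / math.log(n)


# Hypothetical two-resource cluster and two allocation snapshots.
capacity = {"cpu": 100, "mem": 200}
balanced = {"a": {"cpu": 50, "mem": 20}, "b": {"cpu": 10, "mem": 100}}
skewed = {"a": {"cpu": 90, "mem": 20}, "b": {"cpu": 5, "mem": 20}}

balanced_score = fairness_disorder(dominant_shares(balanced, capacity))
skewed_score = fairness_disorder(dominant_shares(skewed, capacity))
```

Under this assumed definition, the balanced snapshot scores lower than the skewed one, which is the kind of signal a simulated annealing search could try to minimize alongside utilization.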