Paper title
Towards automated kernel selection in machine learning systems: A SYCL case study
Paper authors
Paper abstract
Automated tuning of compute kernels is a popular area of research, mainly focused on finding optimal kernel parameters for a problem with fixed input sizes. This approach works well for deploying machine learning models, where the network topology is constant, but machine learning research often involves changing network topologies and hyperparameters. Traditional kernel auto-tuning has limited impact in this case; a more general selection of kernels is required for libraries to accelerate machine learning research. In this paper we present initial results using machine learning to select kernels in a case study deploying high-performance SYCL kernels in libraries that target a range of heterogeneous devices, from desktop GPUs to embedded accelerators. The techniques investigated apply more generally and could similarly be integrated with other heterogeneous programming systems. By combining auto-tuning and machine learning, these kernel selection processes can be deployed with little developer effort to achieve high performance on new hardware.
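To make the contrast between fixed-size auto-tuning and learned kernel selection concrete, the following is a minimal sketch, not the method used in the paper. It assumes a hypothetical table of tuning results for matrix multiplication (the shapes and the variant names `tile_4x4`, `tile_8x8`, and `vec_row` are invented for illustration) and uses a simple 1-nearest-neighbour rule as a stand-in for whatever model the authors trained. The point is only that a selector learned from tuned shapes can answer for shapes that were never tuned.

```python
import math

# Hypothetical auto-tuning results: (M, K, N) of a matrix multiply -> fastest variant.
# Both the shapes and the variant names are invented for this illustration.
TUNED = {
    (64, 64, 64): "tile_4x4",
    (128, 128, 128): "tile_4x4",
    (1024, 1024, 1024): "tile_8x8",
    (4096, 512, 4096): "tile_8x8",
    (1, 4096, 1000): "vec_row",
    (8, 2048, 1000): "vec_row",
}


def features(shape):
    """Map a problem shape to log2 features so that size ratios drive the distance."""
    return tuple(math.log2(d) for d in shape)


def select_kernel(shape, tuned=TUNED):
    """Return the variant recorded for the nearest tuned shape (1-nearest-neighbour)."""
    query = features(shape)
    nearest = min(tuned, key=lambda s: math.dist(features(s), query))
    return tuned[nearest]


if __name__ == "__main__":
    # A shape that was tuned directly, then two shapes that were never tuned.
    for shape in [(64, 64, 64), (2048, 2048, 2048), (2, 4096, 1000)]:
        print(shape, "->", select_kernel(shape))
```

A plain lookup in the tuned table would fail for the last two shapes, which is the limitation of fixed-size tuning that the abstract describes. The nearest-neighbour rule here is the simplest possible selector; any classifier trained on the same benchmark data could take its place.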