Scheduling Real-time Deep Learning Services as Imprecise Computations

Paper title

Scheduling Real-time Deep Learning Services as Imprecise Computations

Authors

Yao, Shuochao, Hao, Yifan, Zhao, Yiran, Shao, Huajie, Liu, Dongxin, Liu, Shengzhong, Wang, Tianshi, Li, Jinyang, Abdelzaher, Tarek

Abstract

The paper presents an efficient real-time scheduling algorithm for intelligent real-time edge services, defined as those that perform machine intelligence tasks, such as voice recognition, LIDAR processing, or machine vision, on behalf of local embedded devices that are themselves unable to support extensive computations. The work contributes to a recent direction in real-time computing that develops scheduling algorithms for machine intelligence tasks with anytime prediction. We show that deep neural network workflows can be cast as imprecise computations, each with a mandatory part and (several) optional parts whose execution utility depends on input data. The goal of the real-time scheduler is to maximize the average accuracy of deep neural network outputs while meeting task deadlines, thanks to opportunistic shedding of the least necessary optional parts. The work is motivated by the proliferation of increasingly ubiquitous but resource-constrained embedded devices (for applications ranging from autonomous cars to the Internet of Things) and the desire to develop services that endow them with intelligence. Experiments on recent GPU hardware and a state-of-the-art deep neural network for machine vision illustrate that our scheme can increase the overall accuracy by 10%-20% while incurring (nearly) no deadline misses.
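
The imprecise-computation model described in the abstract can be illustrated with a small sketch. It is not the paper's algorithm: it is a simple greedy heuristic, assuming a single processor, tasks that are all released at time 0, earliest-deadline-first ordering, and per-stage accuracy gains that are already known as estimates (in the paper the utility depends on the input data). The names `Task`, `feasible`, and `plan`, and all numbers in the usage note, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based equality, so list.remove() targets the exact object
class Task:
    name: str
    deadline: float   # relative to time 0 (assumption: all tasks are released together)
    mandatory: float  # execution time of the mandatory part
    optional: list    # [(exec_time, estimated_accuracy_gain), ...] in stage order
    depth: int = 0    # number of optional stages admitted so far

    def demand(self, extra=0):
        """Execution time of the mandatory part plus the admitted optional stages."""
        admitted = self.optional[: self.depth + extra]
        return self.mandatory + sum(time for time, _ in admitted)


def feasible(tasks, candidate=None):
    """EDF check: run tasks in deadline order and verify each finishes in time.

    If `candidate` is given, that task is charged for one additional optional stage.
    """
    elapsed = 0.0
    for task in sorted(tasks, key=lambda t: t.deadline):
        elapsed += task.demand(extra=1 if task is candidate else 0)
        if elapsed > task.deadline:
            return False
    return True


def plan(tasks):
    """Greedily admit optional stages by estimated gain per unit of execution time."""
    if not feasible(tasks):
        raise ValueError("mandatory parts alone miss a deadline")
    open_tasks = [t for t in tasks if t.optional]
    while open_tasks:
        # Pick the task whose next optional stage has the highest gain density.
        best = max(
            open_tasks,
            key=lambda t: t.optional[t.depth][1] / t.optional[t.depth][0],
        )
        if feasible(tasks, candidate=best):
            best.depth += 1
            if best.depth == len(best.optional):
                open_tasks.remove(best)  # every stage of this task is admitted
        else:
            open_tasks.remove(best)      # shed this stage and all later ones
    return {t.name: t.depth for t in tasks}
```

For example, with `Task("A", 10, 2, [(2, 0.10), (3, 0.03)])` and `Task("B", 12, 3, [(2, 0.08), (4, 0.02)])`, `plan` returns `{"A": 2, "B": 1}`: both optional stages of A fit, while the second stage of B is shed because admitting it would push B past its deadline.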

Paper title