Play It Cool: Dynamic Shifting Prevents Thermal Throttling

Paper Title

Play It Cool: Dynamic Shifting Prevents Thermal Throttling

Authors

Yang Zhou, Feng Liang, Ting-wu Chin, Diana Marculescu

Abstract

Machine learning (ML) has entered the mobile era where an enormous number of ML models are deployed on edge devices. However, running common ML models on edge devices continuously may generate excessive heat from the computation, forcing the device to "slow down" to prevent overheating, a phenomenon called thermal throttling. This paper studies the impact of thermal throttling on mobile phones: when it occurs, the CPU clock frequency is reduced, and the model inference latency may increase dramatically. This unpleasant inconsistent behavior has a substantial negative effect on user experience, but it has been overlooked for a long time. To counter thermal throttling, we propose to utilize dynamic networks with shared weights and dynamically shift between large and small ML models seamlessly according to their thermal profile, i.e., shifting to a small model when the system is about to throttle. With the proposed dynamic shifting, the application runs consistently without experiencing CPU clock frequency degradation and latency increase. In addition, we also study the resulting accuracy when dynamic shifting is deployed and show that our approach provides a reasonable trade-off between model latency and model accuracy.
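
The shifting idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `ShiftController`, the two temperature thresholds, and the use of hysteresis (separate shift-down and shift-up temperatures) are all assumptions made for illustration. The abstract only states that the system shifts to a small model when the device is about to throttle.

```python
from dataclasses import dataclass


@dataclass
class ShiftController:
    """Picks the large or small sub-network from a temperature reading.

    Both thresholds are hypothetical values chosen for illustration;
    the abstract does not give concrete numbers.
    """
    shift_down_c: float = 42.0  # assumed: at or above this, use the small model
    shift_up_c: float = 38.0    # assumed: at or below this, return to the large model
    mode: str = "large"

    def update(self, temp_c: float) -> str:
        # Shift down just before the (assumed) throttling point.
        if self.mode == "large" and temp_c >= self.shift_down_c:
            self.mode = "small"
        # Shift back up only after the device has cooled past a lower
        # threshold, so the controller does not oscillate near one value.
        elif self.mode == "small" and temp_c <= self.shift_up_c:
            self.mode = "large"
        return self.mode


def run_trace(temps, controller=None):
    """Return the model choice for each temperature reading in order."""
    controller = controller or ShiftController()
    return [controller.update(t) for t in temps]


if __name__ == "__main__":
    print(run_trace([35, 41, 42, 40, 39, 38, 36]))
```

Because the large and small models share weights, a real deployment would presumably switch between them without reloading parameters; here the returned string only stands in for that choice.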
