Paper Title
Neighbors From Hell: Voltage Attacks Against Deep Learning Accelerators on Multi-Tenant FPGAs
Paper Authors
Abstract
Field-programmable gate arrays (FPGAs) are becoming widely used accelerators for a myriad of datacenter applications due to their flexibility and energy efficiency. Among these applications, FPGAs have shown promising results in accelerating low-latency real-time deep learning (DL) inference, which is becoming an indispensable component of many end-user applications. With the emerging research direction towards virtualized cloud FPGAs that can be shared by multiple users, the security aspect of FPGA-based DL accelerators requires careful consideration. In this work, we evaluate the security of DL accelerators against voltage-based integrity attacks in a multi-tenant FPGA scenario. We first demonstrate the feasibility of such attacks on a state-of-the-art Stratix 10 card using different attacker circuits that are logically and physically isolated in a separate attacker role, and cannot be flagged as malicious circuits by conventional bitstream checkers. We show that aggressive clock gating, an effective power-saving technique, can also be a potential security threat in modern FPGAs. Then, we carry out the attack on a DL accelerator running ImageNet classification in the victim role to evaluate the inherent resilience of DL models against timing faults induced by the adversary. We find that even when using the strongest attacker circuit, the prediction accuracy of the DL accelerator is not compromised when running at its safe operating frequency. Furthermore, we can achieve 1.18-1.31x higher inference performance by overclocking the DL accelerator without affecting its prediction accuracy.