Paper Title
A Tale of Two Models: Constructing Evasive Attacks on Edge Models
Paper Authors
Paper Abstract
Full-precision deep learning models are typically too large or costly to deploy on edge devices. To accommodate the limited hardware resources, models are adapted to the edge using various edge-adaptation techniques, such as quantization and pruning. While such techniques may have a negligible impact on top-line accuracy, the adapted models exhibit subtle differences in output compared to the original model from which they are derived. In this paper, we introduce a new evasive attack, DIVA, that exploits these differences introduced by edge adaptation by adding adversarial noise to the input data that maximizes the output difference between the original and adapted models. Such an attack is particularly dangerous, because the malicious input will trick the adapted model running on the edge, but will be virtually undetectable by the original model, which typically serves as the authoritative model version used for validation, debugging, and retraining. We compare DIVA to a state-of-the-art attack, PGD, and show that DIVA is only 1.7-3.6% less effective at attacking the adapted model, but is 1.9-4.2 times more likely to go undetected by the original model, under both whitebox and semi-blackbox settings, compared to PGD.
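The abstract describes DIVA as perturbing the input so that the edge-adapted model's output diverges as much as possible from the original model's. As a rough illustration only, the PyTorch sketch below implements that idea with a PGD-style loop; the function name `diva_style_attack`, the KL-divergence objective, and the hyperparameters are assumptions made for illustration and do not reproduce the paper's exact loss.

```python
import torch
import torch.nn.functional as F

def diva_style_attack(original_model, adapted_model, x, epsilon=8 / 255,
                      alpha=2 / 255, steps=20):
    """PGD-style sketch that maximizes the disagreement between an
    edge-adapted model (e.g. quantized or pruned) and its full-precision
    original, so the adversarial input fools the adapted model while
    staying inconspicuous to the original. The KL objective below is an
    illustrative stand-in, not the paper's exact loss."""
    original_model.eval()
    adapted_model.eval()
    x_adv = x.clone().detach()

    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits_orig = original_model(x_adv)    # authoritative (original) model
        logits_adapt = adapted_model(x_adv)    # edge-adapted model
        # Maximize the divergence between the two models' output distributions.
        loss = F.kl_div(F.log_softmax(logits_adapt, dim=1),
                        F.softmax(logits_orig, dim=1),
                        reduction="batchmean")
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                    # gradient ascent step
            x_adv = x + torch.clamp(x_adv - x, -epsilon, epsilon)  # project to L-inf ball
            x_adv = torch.clamp(x_adv, 0.0, 1.0)                   # keep valid pixel range
        x_adv = x_adv.detach()
    return x_adv
```

Note that the abstract also emphasizes that the malicious input should remain virtually undetectable by the original model; a full attack would presumably include an additional term rewarding an unchanged prediction from the original model, which is omitted here to keep the sketch short.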