Blackbox Trojanising of Deep Learning Models: Using non-intrusive network structure and binary alterations

Paper title

Blackbox Trojanising of Deep Learning Models: Using non-intrusive network structure and binary alterations

Paper author

Pan, Jonathan

Paper abstract

Recent advancements in Artificial Intelligence, namely in Deep Learning, have heightened its adoption in many applications. Some of these applications play such important roles that we depend heavily on them for our livelihood. However, as with all technologies, there are vulnerabilities that malicious actors could exploit. One form of exploitation is to turn these technologies, intended for good, into dual-purposed instruments that support deviant acts, like malicious software trojans. As part of proactive defense, researchers are identifying such vulnerabilities so that protective measures can be developed subsequently. This research explores a novel blackbox trojanising approach that applies a simple network structure modification to any deep learning image classification model, transforming a benign model into a deviant one with a simple manipulation of the weights to induce specific types of errors. Propositions to protect against such simple exploits are discussed. This research highlights the importance of providing sufficient safeguards for these models so that the intended good of AI innovation and adoption may be protected.
