Paper Title
StyleFool: Fooling Video Classification Systems via Style Transfer
Paper Authors
Paper Abstract
Video classification systems are vulnerable to adversarial attacks, which can create severe security problems in video verification. Current black-box attacks need a large number of queries to succeed, resulting in high computational overhead during the attack. On the other hand, attacks with restricted perturbations are ineffective against defenses such as denoising or adversarial training. In this paper, we focus on unrestricted perturbations and propose StyleFool, a black-box video adversarial attack via style transfer that fools video classification systems. StyleFool first utilizes color theme proximity to select the best style image, which helps avoid unnatural details in the stylized videos. Meanwhile, target class confidence is additionally considered in targeted attacks to influence the output distribution of the classifier by moving the stylized video closer to, or even across, the decision boundary. A gradient-free method is then employed to further optimize the adversarial perturbations. We carry out extensive experiments to evaluate StyleFool on two standard datasets, UCF-101 and HMDB-51. The experimental results demonstrate that StyleFool outperforms state-of-the-art adversarial attacks in terms of both the number of queries and the robustness against existing defenses. Moreover, 50% of the stylized videos in untargeted attacks do not need any query, since they can already fool the video classification model. Furthermore, we evaluate indistinguishability through a user study to show that the adversarial samples of StyleFool look imperceptible to human eyes, despite the unrestricted perturbations.
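To make the color-theme-proximity step concrete, the sketch below picks, from a pool of candidate style images, the one whose dominant colors are closest to those of the input video. The helper names (`extract_color_theme`, `select_style_image`), the k-means-based theme extraction, and the Euclidean distance between themes are illustrative assumptions, not the paper's exact implementation.

```python
# A minimal sketch (assumed helpers, not the authors' code) of selecting a
# style image by color theme proximity: the candidate whose dominant colors
# are closest to the video's dominant colors is chosen.
import numpy as np
from sklearn.cluster import KMeans

def extract_color_theme(pixels, k=5):
    """Approximate a color theme as the k dominant RGB colors of the pixels."""
    kmeans = KMeans(n_clusters=k, n_init=10).fit(pixels.reshape(-1, 3))
    # Order cluster centers by cluster size so two themes can be compared.
    counts = np.bincount(kmeans.labels_, minlength=k)
    return kmeans.cluster_centers_[np.argsort(-counts)]

def select_style_image(video_frames, style_images, k=5, sample=10000):
    """Return the index of the style image with the closest color theme."""
    pixels = np.concatenate([f.reshape(-1, 3) for f in video_frames]).astype(np.float32)
    idx = np.random.choice(len(pixels), size=min(sample, len(pixels)), replace=False)
    video_theme = extract_color_theme(pixels[idx], k)
    dists = [np.linalg.norm(video_theme - extract_color_theme(s.astype(np.float32), k))
             for s in style_images]
    return int(np.argmin(dists))
```

In the pipeline described in the abstract, this selection precedes style transfer of the video, after which the stylized video is refined with a gradient-free, query-based optimization until it crosses the classifier's decision boundary.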