Paper Title

On Attacking Out-Domain Uncertainty Estimation in Deep Neural Networks

Authors

Huimin Zeng, Zhenrui Yue, Yang Zhang, Ziyi Kou, Lanyu Shang, Dong Wang

Abstract

In many applications with real-world consequences, it is crucial to develop reliable uncertainty estimates for the predictions made by AI decision systems. Toward the goal of estimating uncertainty, various deep neural network (DNN) based uncertainty estimation algorithms have been proposed. However, the robustness of the uncertainty returned by these algorithms has not been systematically explored. In this work, to raise the research community's awareness of robust uncertainty estimation, we show that state-of-the-art uncertainty estimation algorithms can fail catastrophically under our proposed adversarial attack, despite their impressive performance on uncertainty estimation. In particular, we aim to attack out-domain uncertainty estimation: under our attack, the uncertainty model is fooled into making high-confidence predictions for out-domain data that it would originally have rejected. Extensive experimental results on various benchmark image datasets show that the uncertainty estimated by state-of-the-art methods can be easily corrupted by our attack.
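
The abstract does not spell out how the attack is implemented. As a rough, hypothetical sketch of the general idea described above (not the paper's actual method), an attacker could run a PGD-style optimization that minimizes the predictive entropy of out-domain inputs, pushing the model toward high-confidence predictions on data it would otherwise reject; the function name and the `eps`, `alpha`, and `steps` values below are illustrative assumptions.

```python
# Hypothetical sketch of an out-domain uncertainty attack (NOT the
# authors' exact method): a PGD-style perturbation that minimizes the
# predictive entropy of a classifier on out-domain inputs, so the model
# is pushed toward high-confidence predictions it would otherwise reject.
import torch
import torch.nn.functional as F

def attack_ood_uncertainty(model, x_ood, eps=8/255, alpha=2/255, steps=10):
    """Return perturbed out-domain inputs with minimized predictive entropy."""
    x_adv = x_ood.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        probs = F.softmax(model(x_adv), dim=1)
        # Predictive entropy: high when the model is uncertain (and would
        # reject the OOD input), low for confident predictions.
        entropy = -(probs * torch.log(probs + 1e-12)).sum(dim=1).mean()
        grad = torch.autograd.grad(entropy, x_adv)[0]
        with torch.no_grad():
            # Descend on entropy within an L-infinity ball of radius eps,
            # keeping pixel values in the valid [0, 1] range.
            x_adv = x_adv - alpha * grad.sign()
            x_adv = x_ood + torch.clamp(x_adv - x_ood, -eps, eps)
            x_adv = torch.clamp(x_adv, 0.0, 1.0).detach()
    return x_adv
```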
