Thesis Title
Robustness and invariance properties of image classifiers
Thesis Author
Thesis Abstract
Deep neural networks have achieved impressive results in many image classification tasks. However, since their performance is usually measured in controlled settings, it is important to ensure that their decisions remain correct when they are deployed in noisy environments. In fact, deep networks are not robust to a large variety of semantics-preserving image modifications, not even to imperceptible image changes known as adversarial perturbations. The poor robustness of image classifiers to small data distribution shifts raises serious concerns about their trustworthiness. To build reliable machine learning models, we must design principled methods to analyze and understand the mechanisms that shape robustness and invariance. This is exactly the focus of this thesis.

First, we study the problem of computing sparse adversarial perturbations. We exploit the geometry of the decision boundaries of image classifiers to compute sparse perturbations very quickly, and we reveal a qualitative connection between adversarial examples and the data features that image classifiers learn. Then, to better understand this connection, we propose a geometric framework that relates the distance of data samples from the decision boundary to the features present in the data. We show that deep classifiers have a strong inductive bias towards invariance to non-discriminative features, and that adversarial training exploits this property to confer robustness. Finally, we focus on the challenging problem of generalization to unforeseen corruptions of the data, and we propose a novel data augmentation scheme for achieving state-of-the-art robustness to common image corruptions.

Overall, our results contribute to the understanding of the fundamental mechanisms of deep image classifiers and pave the way for building more reliable machine learning systems that can be deployed in real-world environments.
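To make the geometric intuition behind sparse perturbations concrete, the following is a minimal, illustrative sketch for a toy linear binary classifier, not the thesis's actual algorithm (deep classifiers would require iterating such a linearized step). For an affine decision boundary, the L1-minimal perturbation that reaches the boundary changes only one coordinate, the one with the largest weight magnitude. All function and variable names below are our own assumptions.

```python
import numpy as np

def sparse_perturbation_linear(x, w, b):
    """L1-minimal (hence single-coordinate, maximally sparse) perturbation that
    places x exactly on the decision boundary {z : w.z + b = 0} of a linear
    binary classifier. Illustrative sketch only."""
    f = np.dot(w, x) + b          # classifier output; zero means "on the boundary"
    j = np.argmax(np.abs(w))      # the most "efficient" coordinate to perturb
    delta = np.zeros_like(x)
    delta[j] = -f / w[j]          # step along that single coordinate onto the boundary
    return delta

# Toy usage with a random 5-dimensional linear classifier and one sample.
rng = np.random.default_rng(0)
w = rng.normal(size=5)
b = 0.1
x = rng.normal(size=5)
delta = sparse_perturbation_linear(x, w, b)
print("non-zero coordinates:", np.count_nonzero(delta))      # 1
print("output after perturbation:", np.dot(w, x + delta) + b) # ~0
```

The sketch only shows why boundary geometry yields sparsity: for a hyperplane constraint w·δ = −f(x), the cheapest L1 solution concentrates the entire step on a single coordinate.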