Paper Title
Robust Training under Label Noise by Over-parameterization
Paper Authors
Paper Abstract
Recently, over-parameterized deep networks, with far more network parameters than training samples, have come to dominate the performance of modern machine learning. However, it is well known that over-parameterized networks tend to overfit and fail to generalize when the training data is corrupted. In this work, we propose a principled approach for robust training of over-parameterized deep networks on classification tasks where a proportion of the training labels is corrupted. The main idea is remarkably simple: label noise is sparse and incoherent with the network learned from clean data, so we model the noise and learn to separate it from the data. Specifically, we model the label noise via an additional sparse over-parameterization term and exploit implicit algorithmic regularization to recover and separate the underlying corruptions. Remarkably, networks trained with this simple method achieve state-of-the-art test accuracy against label noise on a variety of real datasets. Furthermore, our experimental results are corroborated by theory on simplified linear models, which shows that exact separation between sparse noise and low-rank data can be achieved under incoherence conditions. This work opens many interesting directions for improving over-parameterized models by using sparse over-parameterization and implicit regularization.
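
To make the mechanism described in the abstract concrete, here is a minimal, self-contained sketch on a toy linear model. Following the abstract's description, the per-sample label noise is modeled by a Hadamard-product over-parameterization s_i = u_i ⊙ u_i − v_i ⊙ v_i, and plain gradient descent from small initialization supplies the implicit algorithmic regularization that drives s toward a sparse solution. Everything else here (Gaussian data, squared loss, the learning rates) is an illustrative assumption, not the authors' implementation:

```python
import torch

# Toy setup: n samples, d features, K classes; a fraction of the labels is
# flipped to a random class (some flips may redraw the true class).
torch.manual_seed(0)
n, d, K, noise_rate = 500, 20, 5, 0.3
X = torch.randn(n, d)
clean = (X @ torch.randn(d, K)).argmax(dim=1)        # ground-truth labels
labels = clean.clone()
flip = torch.rand(n) < noise_rate
labels[flip] = torch.randint(0, K, (int(flip.sum()),))
Y = torch.nn.functional.one_hot(labels, K).float()   # noisy one-hot targets

# Linear predictor plus a per-sample noise term s_i = u_i*u_i - v_i*v_i.
# Gradient descent on (u, v) from small initialization is implicitly biased
# toward sparse s, so s should absorb only the corrupted entries.
W = torch.zeros(d, K, requires_grad=True)
u = torch.full((n, K), 1e-3, requires_grad=True)     # small init is essential
v = torch.full((n, K), 1e-3, requires_grad=True)

# A larger step size for the noise variables lets s activate; these rates
# are illustrative choices, not tuned values from the paper.
opt = torch.optim.SGD([{"params": [W], "lr": 0.5},
                       {"params": [u, v], "lr": 20.0}])

for step in range(1500):
    opt.zero_grad()
    s = u * u - v * v                   # sparse estimate of label corruption
    loss = 0.5 * (X @ W + s - Y).pow(2).sum() / n
    loss.backward()
    opt.step()

# The predictor, freed from fitting the noise, should recover most clean
# labels even though it was trained only on the corrupted ones.
acc = ((X @ W).argmax(dim=1) == clean).float().mean().item()
print(f"accuracy against clean labels: {acc:.3f}")
```

The two design choices that matter are the small initialization of u and v and the larger step size on the noise variables: entries of s grow multiplicatively only where residuals stay persistently large, i.e. on the corrupted samples, while the limited-capacity predictor W fits the signal shared across the clean samples.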