Paper Title
The Dynamic of Consensus in Deep Networks and the Identification of Noisy Labels
Paper Authors
Paper Abstract
Deep neural networks have incredible capacity and expressibility, and can seemingly memorize any training set. This introduces a problem when training in the presence of noisy labels, as the noisy examples cannot be distinguished from clean examples by the end of training. Recent research has dealt with this challenge by utilizing the fact that deep networks seem to memorize clean examples much earlier than noisy examples. Here we report a new empirical result: for each example, when looking at the time at which it is memorized by each model in an ensemble of networks, the diversity seen in noisy examples is much larger than in clean examples. We use this observation to develop a new method for noisy-label filtration. The method is based on a statistic of the data that captures the differences in ensemble learning dynamics between clean and noisy data. We test our method on three tasks: (i) noise amount estimation; (ii) noise filtration; (iii) supervised classification. We show that our method improves over existing baselines in all three tasks using a variety of datasets, noise models, and noise levels. Aside from its improved performance, our method has two other advantages. (i) Simplicity, which implies that no additional hyperparameters are introduced. (ii) Our method is modular: it does not work in an end-to-end fashion, and can therefore be used to clean a dataset for any other future usage.
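The core idea above can be illustrated with a small sketch. This is not the authors' exact statistic; it is a hedged toy version assuming one simple definition of "memorization time" (the first epoch at which a model predicts the example's given label) and using the standard deviation of that time across the ensemble as the diversity score. The function name `filter_noisy` and the threshold parameter are illustrative assumptions.

```python
import numpy as np

def filter_noisy(correct, threshold):
    """Flag likely noisy-label examples by ensemble disagreement.

    correct: bool array of shape (n_models, n_epochs, n_examples),
             where correct[m, e, i] is True iff model m predicts
             example i's (possibly noisy) label at epoch e.
    Returns a bool array of shape (n_examples,): True = flagged noisy.
    """
    n_models, n_epochs, n_examples = correct.shape
    # First epoch at which each model fits each example's label;
    # examples a model never fits are assigned n_epochs.
    any_correct = correct.any(axis=1)               # (n_models, n_examples)
    first = np.argmax(correct, axis=1)              # first True along epochs
    first = np.where(any_correct, first, n_epochs)
    # Diversity of memorization times across the ensemble: the paper's
    # observation is that this spread is much larger for noisy examples.
    diversity = first.std(axis=0)                   # (n_examples,)
    return diversity > threshold
```

For instance, a clean example memorized by all models around the same epoch gets a spread near zero, while a noisy example fitted at epoch 0 by one model, epoch 4 by another, and never by a third gets a large spread and is flagged.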