Paper Title
Which Strategies Matter for Noisy Label Classification? Insight into Loss and Uncertainty
Paper Authors
Paper Abstract
Label noise is a critical factor that degrades the generalization performance of deep neural networks, leading to severe issues in real-world problems. Existing studies have employed strategies based on either loss or uncertainty to address noisy labels, and, ironically, some of these strategies contradict one another: emphasizing versus discarding uncertain samples, or concentrating on high- versus low-loss samples. To elucidate how such opposing strategies can each enhance model performance, and to offer insight into training with noisy labels, we present analytical results on how the loss and uncertainty values of samples change throughout the training process. Building on this in-depth analysis, we design a new robust training method that emphasizes clean and informative samples while minimizing the influence of noise, using both loss and uncertainty. We demonstrate the effectiveness of our method with extensive experiments on synthetic and real-world datasets across various deep learning models. The results show that our method significantly outperforms other state-of-the-art methods and applies generally, regardless of neural network architecture.
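The abstract describes weighting samples by combining per-sample loss (to filter probable label noise) with uncertainty (to emphasize informative samples). The paper's actual formulation is not given here, so the following is only a minimal, hypothetical sketch of that general idea: the function name `sample_weights`, the loss-quantile cutoff, and the uncertainty normalization are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def sample_weights(losses, uncertainties, loss_quantile=0.7):
    """Toy loss/uncertainty weighting (illustrative only).

    - Samples whose loss exceeds a quantile threshold are treated as
      likely noisy and receive weight 0.
    - Among the remaining (likely clean) samples, higher uncertainty
      is taken as a sign of informativeness and increases the weight.
    """
    losses = np.asarray(losses, dtype=float)
    uncertainties = np.asarray(uncertainties, dtype=float)

    # Discard probable-noise samples above the loss quantile.
    threshold = np.quantile(losses, loss_quantile)
    clean_mask = losses <= threshold

    # Normalize uncertainty to [0, 1] and boost uncertain clean samples.
    u = uncertainties / (uncertainties.max() + 1e-12)
    weights = np.where(clean_mask, 1.0 + u, 0.0)

    # Rescale so the weights average to 1 over the batch.
    weights *= len(weights) / (weights.sum() + 1e-12)
    return weights
```

In a training loop, these weights would multiply the per-sample losses before reduction (e.g., `(weights * per_sample_loss).mean()` in PyTorch with `reduction='none'`); the quantile cutoff stands in for whatever noise-rate estimate a real method would use.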