Paper Title
FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
Paper Authors
Paper Abstract
Semi-supervised learning (SSL) provides an effective means of leveraging unlabeled data to improve a model's performance. In this paper, we demonstrate the power of a simple combination of two common SSL methods: consistency regularization and pseudo-labeling. Our algorithm, FixMatch, first generates pseudo-labels using the model's predictions on weakly-augmented unlabeled images. For a given image, the pseudo-label is only retained if the model produces a high-confidence prediction. The model is then trained to predict the pseudo-label when fed a strongly-augmented version of the same image. Despite its simplicity, we show that FixMatch achieves state-of-the-art performance across a variety of standard semi-supervised learning benchmarks, including 94.93% accuracy on CIFAR-10 with 250 labels and 88.61% accuracy with 40 -- just 4 labels per class. Since FixMatch bears many similarities to existing SSL methods that achieve worse performance, we carry out an extensive ablation study to tease apart the experimental factors that are most important to FixMatch's success. We make our code available at https://github.com/google-research/fixmatch.
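The abstract's core recipe can be sketched in a few lines: pseudo-label each unlabeled image from the model's prediction on a weak augmentation, keep the label only if the top-class confidence clears a threshold (the paper uses 0.95), and apply a cross-entropy loss against the prediction on a strong augmentation. Below is a minimal NumPy sketch of that unlabeled-data loss; it is not the authors' implementation, and the function name, argument shapes, and default threshold here are illustrative assumptions.

```python
import numpy as np

def fixmatch_unlabeled_loss(probs_weak, logits_strong, threshold=0.95):
    """Sketch of FixMatch's loss on unlabeled data (illustrative, not official).

    probs_weak:    (N, C) model probabilities on weakly-augmented images
    logits_strong: (N, C) model logits on strongly-augmented versions
    Only samples whose weak-augmentation confidence reaches `threshold`
    contribute; their argmax class becomes a hard pseudo-label.
    """
    pseudo_labels = probs_weak.argmax(axis=1)      # hard pseudo-labels
    mask = probs_weak.max(axis=1) >= threshold     # confidence gate

    # Numerically stable log-softmax over the strong-augmentation logits.
    shifted = logits_strong - logits_strong.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

    # Cross-entropy of each strong-view prediction against its pseudo-label.
    per_sample_ce = -log_probs[np.arange(len(pseudo_labels)), pseudo_labels]

    # Average over the full batch; low-confidence samples contribute zero.
    return (mask * per_sample_ce).mean(), mask
```

For example, with a confident first sample and an unconfident second one, only the first contributes to the loss:

```python
probs_weak = np.array([[0.97, 0.03], [0.60, 0.40]])
logits_strong = np.array([[2.0, 0.0], [0.0, 2.0]])
loss, mask = fixmatch_unlabeled_loss(probs_weak, logits_strong)
# mask is [True, False]; only the first sample is trained on.
```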