Paper Title
Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation from a Blackbox Model
Paper Authors
Paper Abstract
We study how to train a student deep neural network for visual recognition by distilling knowledge from a blackbox teacher model in a data-efficient manner. Progress on this problem can significantly reduce the dependence on large-scale datasets for learning high-performing visual recognition models. There are two major challenges. One is that the number of queries into the teacher model should be minimized to save computational and/or financial costs. The other is that the number of images used for the knowledge distillation should be small; otherwise, it violates our expectation of reducing the dependence on large-scale datasets. To tackle these challenges, we propose an approach that blends mixup and active learning. The former effectively augments the few unlabeled images with a large pool of synthetic images sampled from the convex hull of the original images, and the latter actively selects hard examples for the student neural network from the pool and queries their labels from the teacher model. We validate our approach with extensive experiments.
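To make the two components concrete, here is a minimal Python sketch of one distillation round as the abstract describes it: mixup builds a large pool of synthetic images as convex combinations of pairs of the few unlabeled images, and active learning queries the blackbox teacher only on the pool images the student is least confident about. All names (`student_predict`, `teacher_query`, `train_student`) and hyperparameters are hypothetical placeholders, not the paper's actual interface; the real method may differ in detail.

```python
# Sketch of one round of active mixup for blackbox knowledge distillation.
# Assumptions: images is an array of shape (n, ...), student_predict returns
# class probabilities of shape (m, C), teacher_query returns labels for a
# batch, and train_student performs a distillation update on the student.
import numpy as np

def make_mixup_pool(images, num_synthetic, rng):
    """Sample synthetic images as convex combinations of random image pairs
    (points in the convex hull of the original images)."""
    n = len(images)
    pool = []
    for _ in range(num_synthetic):
        i, j = rng.integers(0, n, size=2)   # a pair of original images
        lam = rng.uniform(0.0, 1.0)          # mixing coefficient in [0, 1]
        pool.append(lam * images[i] + (1.0 - lam) * images[j])
    return np.stack(pool)

def select_hard_examples(student_probs, k):
    """Pick the k pool images the student is least confident about,
    measured by its maximum class probability."""
    confidence = student_probs.max(axis=1)
    return np.argsort(confidence)[:k]

def distillation_round(images, student_predict, teacher_query, train_student,
                       num_synthetic=10000, k=500, seed=0):
    rng = np.random.default_rng(seed)
    pool = make_mixup_pool(images, num_synthetic, rng)  # augment the few images
    probs = student_predict(pool)                       # student's predictions
    hard_idx = select_hard_examples(probs, k)           # active selection
    labels = teacher_query(pool[hard_idx])              # costly blackbox queries
    train_student(pool[hard_idx], labels)               # distill onto student
```

Querying the teacher only on the selected hard examples, rather than the whole synthetic pool, is what keeps the query count (and hence the computational or financial cost) low.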