Paper Title

Mixed Sample Augmentation for Online Distillation

Paper Authors

Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo

Paper Abstract

Mixed Sample Regularization (MSR), such as MixUp or CutMix, is a powerful data augmentation strategy for generalizing convolutional neural networks. Previous empirical analysis has illustrated an orthogonal performance gain between MSR and conventional offline Knowledge Distillation (KD). To be more specific, student networks can be enhanced by involving MSR in the training stage of sequential distillation. Yet the interplay between MSR and online knowledge distillation, where an ensemble of peer students learn mutually from each other, remains unexplored. To bridge the gap, we make the first attempt to incorporate CutMix into online distillation, where we empirically observe a significant improvement. Encouraged by this fact, we propose an even stronger MSR specifically for online distillation, named CutⁿMix. Furthermore, a novel online distillation framework is designed upon CutⁿMix to enhance distillation with feature-level mutual learning and a self-ensemble teacher. Comprehensive evaluations on CIFAR10 and CIFAR100 with six network architectures show that our approach consistently outperforms state-of-the-art distillation methods.
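
Only the abstract is quoted here, so the following is a minimal PyTorch sketch of the two standard ingredients it names: CutMix augmentation and a DML-style mutual-learning loss between peer students. The function names (rand_bbox, cutmix, mutual_kl) and the hyperparameters alpha and temperature T are illustrative assumptions; this is not the authors' CutⁿMix implementation, which additionally mixes across n peers and distills at the feature level.

    import numpy as np
    import torch
    import torch.nn.functional as F

    def rand_bbox(h, w, lam):
        # Sample a rectangle whose area is roughly (1 - lam) of the image.
        cut_rat = np.sqrt(1.0 - lam)
        cut_h, cut_w = int(h * cut_rat), int(w * cut_rat)
        cy, cx = np.random.randint(h), np.random.randint(w)
        y1, y2 = np.clip(cy - cut_h // 2, 0, h), np.clip(cy + cut_h // 2, 0, h)
        x1, x2 = np.clip(cx - cut_w // 2, 0, w), np.clip(cx + cut_w // 2, 0, w)
        return y1, y2, x1, x2

    def cutmix(x, y, alpha=1.0):
        # Standard CutMix: paste a patch from a randomly shuffled batch.
        lam = np.random.beta(alpha, alpha)
        idx = torch.randperm(x.size(0))
        x = x.clone()
        y1, y2, x1, x2 = rand_bbox(x.size(2), x.size(3), lam)
        x[:, :, y1:y2, x1:x2] = x[idx, :, y1:y2, x1:x2]
        # Recompute lambda from the exact pasted area.
        lam = 1.0 - (y2 - y1) * (x2 - x1) / float(x.size(2) * x.size(3))
        return x, y, y[idx], lam

    def mutual_kl(logits_a, logits_b, T=3.0):
        # DML-style mutual learning: student A matches B's softened output.
        p_b = F.softmax(logits_b.detach() / T, dim=1)
        log_p_a = F.log_softmax(logits_a / T, dim=1)
        return F.kl_div(log_p_a, p_b, reduction="batchmean") * (T * T)

A training step for each peer would then combine the mixed-label cross-entropy, lam * CE(out, y_a) + (1 - lam) * CE(out, y_b), with mutual_kl terms against the other students' logits; the proposed framework further adds feature-level mutual learning and a self-ensemble teacher, which this sketch does not cover.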
