Paper Title


ORC: Network Group-based Knowledge Distillation using Online Role Change

Authors

Junyong Choi, Hyeon Cho, Seokhwa Cheung, Wonjun Hwang

Abstract


In knowledge distillation, since a single omnipotent teacher network cannot solve all problems, knowledge distillation from multiple teachers has recently been studied. However, their improvements are sometimes smaller than expected because immature teachers may transfer false knowledge to the student. In this paper, to overcome this limitation and exploit the efficacy of multiple networks, we divide the networks into a teacher group and a student group. That is, the student group is a set of immature networks that still need to learn the teacher's knowledge, while the teacher group consists of selected networks that are capable of teaching successfully. We propose an online role change strategy in which the top-ranked networks in the student group are promoted to the teacher group at every iteration. After training the teacher group on the error samples of the student group to refine its knowledge, we transfer the collaborative knowledge of the teacher group to the student group. We verify the superiority of the proposed method on CIFAR-10, CIFAR-100, and ImageNet, on which it achieves high performance. We further demonstrate the generality of our method with various backbone architectures such as ResNet, WRN, VGG, MobileNet, and ShuffleNet.
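The two mechanisms the abstract describes, promoting top-ranked students to the teacher group at each iteration and refining the teachers on the students' error samples, can be sketched in simplified form. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the ranking criterion (a per-network validation score), and the representation of networks as plain callables are all hypothetical.

```python
def orc_step(networks, scores, n_teachers=2):
    """One online role change: split networks into teacher/student groups.

    `scores[i]` is a current validation metric for `networks[i]` (assumed);
    the top-ranked networks are promoted to the teacher group for this
    iteration, and the rest form the student group.
    """
    order = sorted(range(len(networks)), key=lambda i: scores[i], reverse=True)
    teachers = [networks[i] for i in order[:n_teachers]]
    students = [networks[i] for i in order[n_teachers:]]
    return teachers, students


def collect_error_samples(students, batch):
    """Gather samples that at least one student misclassifies.

    In the paper's scheme these hard samples are then used to train the
    teacher group before its collaborative knowledge is distilled back to
    the students. Here a "student" is just a callable x -> predicted label.
    """
    return [(x, y) for x, y in batch if any(s(x) != y for s in students)]
```

With four toy networks scored 0.9, 0.5, 0.7, and 0.6, `orc_step` would promote the first and third to the teacher group; in a real training loop both the scores and the resulting grouping change at every iteration.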
