Paper Title
Knowledge as Priors: Cross-Modal Knowledge Generalization for Datasets without Superior Knowledge
Paper Authors
Abstract
Cross-modal knowledge distillation deals with transferring knowledge from a model trained with superior modalities (Teacher) to another model trained with weak modalities (Student). Existing approaches require that paired training examples exist in both modalities. However, accessing the data from superior modalities may not always be feasible. For example, in the case of 3D hand pose estimation, depth maps, point clouds, or stereo images usually capture better hand structures than RGB images, but most of them are expensive to collect. In this paper, we propose a novel scheme to train the Student on a Target dataset where the Teacher is unavailable. Our key idea is to generalize the distilled cross-modal knowledge learned from a Source dataset, which contains paired examples from both modalities, to the Target dataset by modeling knowledge as priors on the parameters of the Student. We name our method "Cross-Modal Knowledge Generalization" and demonstrate that our scheme results in competitive performance for 3D hand pose estimation on standard benchmark datasets.
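The abstract's core idea of "modeling knowledge as priors on the parameters of the Student" can be illustrated with a minimal sketch. This is a hypothetical toy example, not the paper's actual implementation: knowledge distilled on the Source dataset is summarized as a Gaussian prior centered at `theta_prior`, and on the Target dataset (where no Teacher exists) the Student minimizes its task loss plus a penalty pulling its parameters toward that prior. The regression task, `theta_prior` values, and hyperparameters below are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy Target-dataset regression task for the Student (no Teacher here).
X = rng.normal(size=(64, 4))
true_w = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ true_w + 0.1 * rng.normal(size=64)

# Hypothetical parameters distilled on the Source dataset; they act as
# the mean of a Gaussian prior over the Student's parameters.
theta_prior = np.array([0.8, -1.5, 0.3, 2.5])

def loss_and_grad(theta, lam=0.1):
    """Task loss plus Gaussian-prior penalty, and the combined gradient.

    The lam * ||theta - theta_prior||^2 term is the negative log of the
    Gaussian prior: it regularizes the Student toward the Source-distilled
    knowledge while the data term fits the Target dataset.
    """
    resid = X @ theta - y
    task = 0.5 * np.mean(resid ** 2)
    prior = 0.5 * lam * np.sum((theta - theta_prior) ** 2)
    grad = X.T @ resid / len(y) + lam * (theta - theta_prior)
    return task + prior, grad

# Plain gradient descent on the Target dataset.
theta = np.zeros(4)
for _ in range(200):
    L, g = loss_and_grad(theta)
    theta -= 0.1 * g

print(theta)  # ends up near true_w, biased slightly toward theta_prior
```

The design choice to note: because the prior enters only through the parameter penalty, no Source data or Teacher forward passes are needed at Target-training time, which is exactly the constraint the abstract describes.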