Ensemble Multi-Source Domain Adaptation with Pseudolabels

Paper title

Ensemble Multi-Source Domain Adaptation with Pseudolabels

Authors

Seongmin Lee, Hyunsik Jeon, U Kang

Abstract

Given multiple labeled source datasets, how can we train a model for a target dataset that has no labels? Multi-source domain adaptation (MSDA) aims to train a model using multiple source datasets that differ from the target dataset, in the absence of target data labels. MSDA is a crucial problem applicable to many practical cases where labels for the target data are unavailable due to privacy issues. Existing MSDA frameworks are limited because they align data without considering the conditional distributions p(x|y) of each domain. They also miss much of the target label information, since they do not consider target labels at all and rely on only one feature extractor. In this paper, we propose Ensemble Multi-source Domain Adaptation with Pseudolabels (EnMDAP), a novel method for multi-source domain adaptation. EnMDAP exploits label-wise moment matching to align the conditional distributions p(x|y), using pseudolabels in place of the unavailable target labels, and introduces an ensemble learning theme by using multiple feature extractors for accurate domain adaptation. Extensive experiments show that EnMDAP achieves state-of-the-art performance on multi-source domain adaptation tasks in both image and text domains.
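
The three ingredients named in the abstract (pseudolabels for the target, label-wise moment matching, and an ensemble over several feature extractors) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the confidence threshold of 0.9, the restriction to first-order moments (class means), and the plain averaging of ensemble outputs are all assumptions made for illustration.

```python
import numpy as np

def pseudolabels(probs, threshold=0.9):
    """Assumed scheme: argmax labels plus a mask of confident predictions."""
    labels = probs.argmax(axis=1)
    mask = probs.max(axis=1) >= threshold
    return labels, mask

def class_means(features, labels, num_classes):
    """Per-class mean feature vector; rows stay NaN for classes with no samples."""
    means = np.full((num_classes, features.shape[1]), np.nan)
    for c in range(num_classes):
        members = features[labels == c]
        if len(members) > 0:
            means[c] = members.mean(axis=0)
    return means

def labelwise_moment_loss(domain_feats, domain_labels, num_classes):
    """Sum of squared distances between class means over all domain pairs.

    Only the first moment is matched here (an assumption of this sketch).
    Classes missing from either domain of a pair are skipped.
    """
    means = [class_means(f, y, num_classes)
             for f, y in zip(domain_feats, domain_labels)]
    loss = 0.0
    for i in range(len(means)):
        for j in range(i + 1, len(means)):
            sq_dist = ((means[i] - means[j]) ** 2).sum(axis=1)
            loss += float(np.nansum(sq_dist))  # NaN entries count as zero
    return loss

def ensemble_predict(prob_list):
    """Average class probabilities from several models, then take the argmax."""
    return np.mean(prob_list, axis=0).argmax(axis=1)
```

In a training loop built on this sketch, the target domain would enter `labelwise_moment_loss` with its pseudolabels standing in for true labels, typically after filtering target samples by the confidence mask, so that each class is aligned across domains separately rather than aligning the pooled data.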
