Paper Title
Distributional Shift Adaptation using Domain-Specific Features
Paper Authors
Paper Abstract
Machine learning algorithms typically assume that training and test samples come from the same distribution, i.e., that the test data are in-distribution. However, in open-world scenarios, streaming big data can be out-of-distribution (OOD), rendering these algorithms ineffective. Prior solutions to the OOD challenge seek to identify features that are invariant across the training domains, on the assumption that these invariant features will also work reasonably well in the unlabeled target domain. By contrast, this work is interested in domain-specific features, which include both invariant features and features unique to the target domain. We propose a simple yet effective approach that relies on correlations in general, regardless of whether the features are invariant. Our approach uses the samples most confidently predicted by an OOD base model (the teacher model) to train a new model (the student model) that effectively adapts to the target domain. Empirical evaluations on benchmark datasets show that performance improves over the state of the art (SOTA) by approximately 10-20%.
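The teacher-student step described in the abstract can be illustrated with a minimal sketch. It assumes the teacher's class probabilities on the unlabeled target samples are already available. The function names, the `keep_frac` selection threshold, and the toy nearest-centroid student are illustrative assumptions only; the abstract does not specify the models or the selection rule the paper uses.

```python
import numpy as np


def select_confident(probs, keep_frac=0.2):
    """Pick the target samples the teacher predicts most confidently.

    probs: (n_samples, n_classes) teacher class probabilities.
    keep_frac: hypothetical hyperparameter, fraction of samples to keep.
    Returns (indices of kept samples, their pseudo-labels).
    """
    probs = np.asarray(probs, dtype=float)
    confidence = probs.max(axis=1)        # teacher's top-class probability
    pseudo_labels = probs.argmax(axis=1)  # teacher's predicted class
    n_keep = max(1, int(round(keep_frac * len(probs))))
    # Descending confidence; a stable sort keeps ties in input order.
    order = np.argsort(-confidence, kind="stable")
    keep = order[:n_keep]
    return keep, pseudo_labels[keep]


def fit_centroids(X, y, n_classes):
    """Toy student (an assumption): one mean feature vector per class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    centroids = np.full((n_classes, X.shape[1]), np.nan)
    for c in range(n_classes):
        mask = y == c
        if mask.any():
            centroids[c] = X[mask].mean(axis=0)
    return centroids


def predict_centroids(centroids, X):
    """Assign each sample to the nearest class centroid."""
    X = np.asarray(X, dtype=float)
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Classes with no pseudo-labeled samples have NaN centroids; never pick them.
    d = np.where(np.isnan(d), np.inf, d)
    return d.argmin(axis=1)


# Toy usage: four target samples and the teacher's probabilities on them.
target_X = np.array([[0.0, 0.0], [1.0, 1.0], [4.0, 4.0], [5.0, 5.0]])
teacher_probs = np.array([[0.9, 0.1], [0.6, 0.4], [0.2, 0.8], [0.55, 0.45]])

keep, pseudo = select_confident(teacher_probs, keep_frac=0.5)
student = fit_centroids(target_X[keep], pseudo, n_classes=2)
student_pred = predict_centroids(student, target_X)
```

Because the student is fit only on target-domain samples, it can pick up any feature that correlates with the pseudo-labels in that domain, whether or not the feature is invariant across the training domains.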