Paper Title
Adapting to Online Label Shift with Provable Guarantees
Paper Authors
Paper Abstract
The standard supervised learning paradigm works effectively when the training data share the same distribution as the upcoming test samples. However, this stationarity assumption is often violated in real-world applications, especially when test data arrive in an online fashion. In this paper, we formulate and investigate the problem of online label shift (OLaS): the learner trains an initial model on labeled offline data and then deploys it to an unlabeled online environment where the underlying label distribution changes over time but the label-conditional density does not. The non-stationary nature of the environment and the lack of supervision make the problem challenging. To address this difficulty, we construct a new unbiased risk estimator that uses the unlabeled data and exhibits many benign properties, albeit with potential non-convexity. Building on this estimator, we propose novel online ensemble algorithms to handle the non-stationarity of the environment. Our approach enjoys optimal dynamic regret, meaning that its performance is competitive with a clairvoyant who knows the online environment in hindsight and chooses the best decision for each round. The obtained dynamic regret bound scales with the intensity and pattern of the label distribution shift, and hence exhibits adaptivity in the OLaS problem. Extensive experiments validate the effectiveness of the approach and support our theoretical findings.
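To make the label shift setting concrete, the following is a minimal sketch of how unlabeled data can reveal a shifted label distribution when the label-conditional density stays fixed. The abstract does not say how the label distribution is estimated, so this sketch assumes a standard confusion-matrix approach (often called black box shift estimation). It is not the paper's risk estimator or its online ensemble algorithm, and the function names `estimate_label_dist` and `reweight_posterior` are hypothetical.

```python
import numpy as np


def estimate_label_dist(confusion, pred_freq):
    """Estimate the current label distribution from unlabeled predictions.

    Assumption (not from the abstract): confusion[i, j] = P(pred = i | y = j),
    measured on labeled offline data. Under label shift this matrix stays
    fixed, so the observed prediction frequencies satisfy
    pred_freq = confusion @ mu, where mu is the current label distribution.
    """
    mu = np.linalg.solve(confusion, pred_freq)
    # The raw solution can leave the simplex on finite samples, so clip
    # negative entries and renormalize.
    mu = np.clip(mu, 0.0, None)
    return mu / mu.sum()


def reweight_posterior(probs, source_prior, target_prior):
    """Adapt offline class posteriors p(y | x) to a new label prior.

    With a fixed label-conditional density, Bayes' rule gives a new
    posterior proportional to probs * target_prior / source_prior.
    """
    weighted = probs * (target_prior / source_prior)
    return weighted / weighted.sum(axis=-1, keepdims=True)


# Toy two-class example with made-up numbers.
confusion = np.array([[0.9, 0.2],
                      [0.1, 0.8]])
true_mu = np.array([0.3, 0.7])
pred_freq = confusion @ true_mu  # what the unlabeled stream would show
mu_hat = estimate_label_dist(confusion, pred_freq)

offline_prior = np.array([0.5, 0.5])
adapted = reweight_posterior(np.array([0.5, 0.5]), offline_prior, mu_hat)
```

In an online setting, the estimate would be recomputed from each round's unlabeled batch, which is where noise and non-stationarity enter and where an online learning procedure becomes necessary.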