Paper Title
Learning from Label Proportions: A Mutual Contamination Framework
Paper Authors
Abstract
Learning from label proportions (LLP) is a weakly supervised setting for classification in which unlabeled training instances are grouped into bags, and each bag is annotated with the proportion of each class occurring in that bag. Prior work on LLP has yet to establish a consistent learning procedure, nor does there exist a theoretically justified, general purpose training criterion. In this work we address these two issues by posing LLP in terms of mutual contamination models (MCMs), which have recently been applied successfully to study various other weak supervision settings. In the process, we establish several novel technical results for MCMs, including unbiased losses and generalization error bounds under non-iid sampling plans. We also point out the limitations of a common experimental setting for LLP, and propose a new one based on our MCM framework.
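The LLP setting described in the abstract can be made concrete with a small synthetic sketch: instances are grouped into bags, the per-instance labels are discarded, and only each bag's class proportion is kept as supervision. The function name, bag sizes, and Gaussian class-conditional features below are illustrative assumptions, not the paper's actual experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_llp_bags(n_bags=4, bag_size=8, p_pos=0.3):
    """Build hypothetical LLP training data: bags of unlabeled
    instances, each annotated only with its positive-class proportion."""
    bags, proportions = [], []
    for _ in range(n_bags):
        # Draw hidden binary labels; they are discarded after
        # computing the bag-level label proportion.
        y = rng.random(bag_size) < p_pos
        # Assumed class-conditional features: N(+1, 1) for the
        # positive class, N(-1, 1) for the negative class.
        x = rng.normal(loc=np.where(y, 1.0, -1.0), scale=1.0)
        bags.append(x)            # unlabeled instances
        proportions.append(y.mean())  # the only supervision kept
    return bags, proportions

bags, props = make_llp_bags()
print(len(bags), len(props))
print(all(0.0 <= g <= 1.0 for g in props))
```

Under the paper's mutual contamination view, each bag drawn this way is a sample from a mixture of the two class-conditional distributions, with the annotated proportion as the mixing weight.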