Partial Multi-label Learning with Label and Feature Collaboration

Paper Title

Partial Multi-label Learning with Label and Feature Collaboration

Authors

Tingting Yu, Guoxian Yu, Jun Wang, Maozu Guo

Abstract

Partial multi-label learning (PML) models the scenario where each training instance is annotated with a set of candidate labels, of which only some are relevant. The PML problem arises in practice because precisely labeled samples are difficult, and sometimes impossible, to obtain. Several PML solutions have been proposed to avoid being misled by the irrelevant labels concealed among the candidate labels, but they generally rely on the smoothness assumption in the feature space or the low-rank assumption in the label space, while ignoring the negative information between features and labels. Specifically, if two instances have largely overlapping candidate labels, then irrespective of their feature similarity, their ground-truth labels should be similar; if they are dissimilar in both the feature and candidate-label spaces, their ground-truth labels should be dissimilar. To obtain a credible predictor on PML data, we propose a novel approach called PML-LFC (Partial Multi-label Learning with Label and Feature Collaboration). PML-LFC estimates the confidence values of the relevant labels for each instance using similarity from both the label and feature spaces, and trains the desired predictor with the estimated confidence values. PML-LFC learns the predictor and the latent label matrix in a mutually reinforcing manner within a unified model, and develops an alternating optimization procedure to optimize them. Extensive empirical studies on both synthetic and real-world datasets demonstrate the superiority of PML-LFC.
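
The abstract does not give the PML-LFC objective, so the following is only a rough illustration of the general idea of estimating label confidences from both label-space and feature-space similarity. It is not the authors' method. Everything in it is an assumption made for illustration: the function names, the Jaccard and Gaussian similarities, the equal weighting of the two similarities, and the propagation-style update. It covers only the "similar instances should share labels" part; the negative (dissimilarity) information and the predictor-training step of the alternating procedure are omitted.

```python
import math

def jaccard(a, b):
    """Overlap of two candidate label sets given as 0/1 lists (assumed choice)."""
    inter = sum(1 for x, y in zip(a, b) if x and y)
    union = sum(1 for x, y in zip(a, b) if x or y)
    return inter / union if union else 0.0

def gaussian(u, v, sigma=1.0):
    """Gaussian-kernel feature similarity (assumed choice)."""
    d2 = sum((x - y) ** 2 for x, y in zip(u, v))
    return math.exp(-d2 / (2 * sigma ** 2))

def estimate_confidence(X, Y, alpha=0.5, n_iter=20):
    """Hypothetical helper: propagate label confidence between similar instances.

    X: list of feature vectors; Y: list of 0/1 candidate label vectors.
    Returns a confidence matrix that is zero outside each candidate set.
    """
    n, q = len(Y), len(Y[0])
    # Combined similarity: plain average of label and feature similarity
    # (an assumption; the paper's weighting is not stated in the abstract).
    S = [[0.0 if i == j else
          0.5 * (jaccard(Y[i], Y[j]) + gaussian(X[i], X[j]))
          for j in range(n)] for i in range(n)]
    # Row-normalise so each instance averages over its neighbours.
    for row in S:
        total = sum(row)
        if total > 0:
            row[:] = [s / total for s in row]
    # Start from the candidate labels themselves.
    P = [[float(y) for y in row] for row in Y]
    for _ in range(n_iter):
        new_P = []
        for i in range(n):
            row = []
            for c in range(q):
                neigh = sum(S[i][j] * P[j][c] for j in range(n))
                # Mix the prior with neighbour evidence; multiplying by
                # Y[i][c] keeps non-candidate labels at zero confidence.
                row.append(Y[i][c] * ((1 - alpha) * Y[i][c] + alpha * neigh))
            new_P.append(row)
        P = new_P
    return P

# Toy data: instance 0 is close to instance 1 and far from instance 2.
X = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
Y = [[1, 1, 0], [1, 0, 0], [0, 1, 1]]
P = estimate_confidence(X, Y)
```

In this toy run, instance 0 has candidate labels 0 and 1; its nearest neighbour carries only label 0, so label 0 ends up with the higher confidence. In a full alternating scheme, a confidence matrix of this kind would be used to fit the predictor, and the predictor's outputs would in turn refine the confidences.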
