Paper Title
Toward Optimal Probabilistic Active Learning Using a Bayesian Approach
Authors
Abstract
Gathering labeled data to train well-performing machine learning models is one of the critical challenges in many applications. Active learning aims to reduce labeling costs through an efficient and effective allocation of costly labeling resources. In this article, we propose a decision-theoretic selection strategy that (1) directly optimizes the gain in misclassification error, and (2) uses a Bayesian approach, introducing a conjugate prior distribution to determine the class posterior and thereby deal with uncertainties. By reformulating existing selection strategies within our proposed model, we can explain which aspects are not covered by the current state of the art and why this leads to the superior performance of our approach. Extensive experiments on a large variety of datasets and different kernels validate our claims.