Paper Title

Evolving Multi-Label Fuzzy Classifier

Paper Authors

Lughofer, Edwin

Paper Abstract

Multi-label classification has attracted much attention in the machine learning community to address the problem of assigning single samples to more than one class at the same time. We propose an evolving multi-label fuzzy classifier (EFC-ML) which is able to self-adapt and self-evolve its structure with new incoming multi-label samples in an incremental, single-pass manner. It is based on a multi-output Takagi-Sugeno type architecture, where for each class a separate consequent hyper-plane is defined. The learning procedure embeds a locally weighted incremental correlation-based algorithm combined with (conventional) recursive fuzzily weighted least squares and Lasso-based regularization. The correlation-based part ensures that the interrelations between class labels, a specific well-known property in multi-label classification for improved performance, are preserved properly; the Lasso-based regularization reduces the curse of dimensionality effects in the case of a higher number of inputs. Antecedent learning is achieved by product-space clustering and conducted for all class labels together, which yields a single rule base, allowing a compact knowledge view. Furthermore, our approach comes with an online active learning (AL) strategy for updating the classifier on just a number of selected samples, which in turn makes the approach applicable for scarcely labelled streams in applications, where the annotation effort is typically expensive. Our approach was evaluated on several data sets from the MULAN repository and showed significantly improved classification accuracy compared to (evolving) one-versus-rest or classifier chaining concepts. A significant result was that, due to the online AL method, a 90% reduction in the number of samples used for classifier updates had little effect on the accumulated accuracy trend lines compared to a full update in most data set cases.
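
For illustration, below is a minimal sketch (not the authors' implementation) of the multi-output Takagi-Sugeno architecture outlined in the abstract: Gaussian rule antecedents in the product space of the inputs, one consequent hyper-plane per rule and class label, and a plain recursive fuzzily weighted least squares update of the consequents with the rule's normalized firing degree as the fuzzy weight. All class names, parameter values and the fixed rule base are assumptions made for the example; the correlation-based coupling between labels, the Lasso-based regularization, rule evolution and the online active learning criterion of EFC-ML are omitted.

```python
# Minimal sketch of a multi-output Takagi-Sugeno classifier (assumed example,
# not the EFC-ML code): Gaussian antecedents in the input product space and one
# consequent hyper-plane per rule and class label.
import numpy as np


class MultiOutputTSClassifier:
    def __init__(self, centers, widths, n_classes):
        # centers, widths: (n_rules, n_inputs) Gaussian antecedent parameters
        self.centers = np.asarray(centers, dtype=float)
        self.widths = np.asarray(widths, dtype=float)
        n_rules, n_inputs = self.centers.shape
        # one consequent hyper-plane (intercept + slopes) per rule and class
        self.theta = np.zeros((n_rules, n_classes, n_inputs + 1))
        # inverse Hessian matrices for recursive fuzzily weighted least squares
        self.P = np.stack([np.eye(n_inputs + 1) * 1e3 for _ in range(n_rules)])

    def _memberships(self, x):
        # Gaussian rule activations, normalized to sum to one
        d2 = ((x - self.centers) / self.widths) ** 2
        mu = np.exp(-0.5 * d2.sum(axis=1))
        return mu / max(mu.sum(), 1e-12)

    def predict_scores(self, x):
        # per-class score: firing-degree-weighted sum of the rule hyper-planes
        psi = self._memberships(x)                 # (n_rules,)
        xe = np.concatenate(([1.0], x))            # regressor with intercept
        local = self.theta @ xe                    # (n_rules, n_classes)
        return psi @ local                         # (n_classes,)

    def predict_labels(self, x, threshold=0.5):
        # multi-label decision: every class whose score exceeds the threshold
        return (self.predict_scores(x) >= threshold).astype(int)

    def update(self, x, y):
        # y: binary indicator vector of the labels active for sample x.
        # Plain recursive weighted least squares per rule, using the rule's
        # normalized firing degree as the (fuzzy) sample weight; EFC-ML
        # additionally couples labels via correlations and applies Lasso.
        psi = self._memberships(x)
        xe = np.concatenate(([1.0], x))
        for i, w in enumerate(psi):
            P = self.P[i]
            denom = 1.0 / max(w, 1e-12) + xe @ P @ xe
            gain = (P @ xe) / denom                # Kalman-style gain vector
            err = y - self.theta[i] @ xe           # residual for every class
            self.theta[i] += np.outer(err, gain)   # update all hyper-planes
            self.P[i] = P - np.outer(gain, xe @ P)


# toy usage: two inputs, two fixed rules, three class labels
clf = MultiOutputTSClassifier(centers=[[0.0, 0.0], [1.0, 1.0]],
                              widths=[[0.5, 0.5], [0.5, 0.5]],
                              n_classes=3)
clf.update(np.array([0.1, 0.2]), np.array([1, 0, 1]))
print(clf.predict_labels(np.array([0.1, 0.2])))
```

In the evolving setting described by the abstract, new rules would additionally be created when incoming samples are poorly covered by the existing antecedents; the rule base is kept fixed here for brevity.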
