Paper Title
Multi-label Causal Variable Discovery: Learning Common Causal Variables and Label-specific Causal Variables
Paper Authors
Paper Abstract
Causal variables in the Markov boundary (MB) have been widely applied in a broad range of single-label tasks. However, few studies focus on causal variable discovery in multi-label data, owing to its complex causal relationships. Since some variables in a multi-label scenario may contain causal information about multiple labels, this paper investigates the problem of multi-label causal variable discovery, as well as the problem of distinguishing common causal variables shared by multiple labels from label-specific causal variables associated with individual labels. Considering multiple MBs under a non-positive joint probability distribution, we explore the relationship between common causal variables and the equivalent information phenomenon, and find that the solutions are influenced by equivalent information through different mechanisms, depending on whether label causality exists. By analyzing these mechanisms, we provide a theoretical property of common causal variables, based on which a discovery and distinguishing algorithm is designed to identify these two types of variables. As in the single-label problem, causal variables for multiple labels also have broad application prospects. To demonstrate this, we apply the proposed causal mechanism to multi-label feature selection and present an interpretable algorithm, which is proved to achieve minimal redundancy and maximum relevance. Extensive experiments demonstrate the efficacy of these contributions.
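To make the distinction between the two variable types concrete, the following is a minimal sketch and not the paper's algorithm. It assumes that a Markov boundary has already been discovered for each label and that each label has a single MB, so it ignores the equivalent information phenomenon that the paper analyzes. The function name split_causal_variables and the variable names X1 to X5 and Y1 to Y3 are hypothetical and chosen only for illustration.

```python
from collections import Counter


def split_causal_variables(mb_per_label):
    """Partition variables into common and label-specific sets.

    mb_per_label maps each label name to the set of variables in its
    already-discovered Markov boundary. This is a simplified illustration:
    it assumes one MB per label and does not handle equivalent information.
    """
    # Count in how many labels' Markov boundaries each variable appears.
    counts = Counter(v for mb in mb_per_label.values() for v in set(mb))
    # A variable shared by more than one label is treated as common.
    common = {v for v, c in counts.items() if c > 1}
    # The remaining variables of each MB are specific to that label.
    specific = {label: set(mb) - common for label, mb in mb_per_label.items()}
    return common, specific


# Hypothetical Markov boundaries for three labels (made-up variable names).
mbs = {
    "Y1": {"X1", "X2", "X5"},
    "Y2": {"X2", "X3"},
    "Y3": {"X4", "X5"},
}
common, specific = split_causal_variables(mbs)
print(sorted(common))
print({label: sorted(vs) for label, vs in specific.items()})
```

In this toy input, X2 and X5 each appear in the Markov boundaries of two labels and are therefore reported as common, while X1, X3 and X4 remain specific to Y1, Y2 and Y3 respectively. Under a non-positive joint distribution a label may have several MBs, in which case a plain set intersection of this kind would no longer be sufficient.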