Paper title

Learning domain-specific causal discovery from time series

Paper authors

Xinyue Wang, Konrad Paul Kording

Abstract

Causal discovery (CD) from time-varying data is important in neuroscience, medicine, and machine learning. Techniques for CD encompass randomized experiments, which are generally unbiased but expensive, and algorithms such as Granger causality, conditional-independence-based, structural-equation-based, and score-based methods that are only accurate under strong assumptions made by human designers. However, as demonstrated in other areas of machine learning, human expertise is often not entirely accurate and tends to be outperformed in domains with abundant data. In this study, we examine whether we can enhance domain-specific causal discovery for time series using a data-driven approach. Our findings indicate that this procedure significantly outperforms human-designed, domain-agnostic causal discovery methods, such as Mutual Information, VAR-LiNGAM, and Granger Causality on the MOS 6502 microprocessor, the NetSim fMRI dataset, and the Dream3 gene dataset. We argue that, when feasible, the causality field should consider a supervised approach in which domain-specific CD procedures are learned from extensive datasets with known causal relationships, rather than being designed by human specialists. Our findings promise a new approach toward improving CD in neural and medical data and for the broader machine learning community.
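To make the supervised idea in the abstract concrete, here is a minimal sketch, not the authors' method. It assumes synthetic linear VAR(1) systems with known edges as the labeled training data, hand-picked lagged cross-correlation features, and a plain logistic regression fitted by gradient descent. All function names and parameters (`simulate_var1`, `pair_features`, `edge_prob`, and so on) are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_var1(n_nodes, n_steps, rng, edge_prob=0.2, weight=0.4):
    """Simulate x[t] = A @ x[t-1] + noise. mask[i, j] is True when j -> i."""
    mask = rng.random((n_nodes, n_nodes)) < edge_prob
    np.fill_diagonal(mask, False)
    A = weight * mask + 0.5 * np.eye(n_nodes)
    # Rescale if needed so the process stays stable (spectral radius < 1).
    radius = np.max(np.abs(np.linalg.eigvals(A)))
    if radius >= 0.95:
        A = A * (0.95 / radius)
    x = np.zeros((n_steps, n_nodes))
    for t in range(1, n_steps):
        x[t] = A @ x[t - 1] + rng.standard_normal(n_nodes)
    return x, mask

def pair_features(x, max_lag=3):
    """Lagged cross-correlations for every ordered pair (candidate edge j -> i)."""
    z = (x - x.mean(axis=0)) / (x.std(axis=0) + 1e-8)
    n_nodes = z.shape[1]
    feats, pairs = [], []
    for i in range(n_nodes):
        for j in range(n_nodes):
            if i == j:
                continue
            f = []
            for lag in range(1, max_lag + 1):
                f.append(np.mean(z[lag:, i] * z[:-lag, j]))  # j leads i
                f.append(np.mean(z[lag:, j] * z[:-lag, i]))  # i leads j
            feats.append(f)
            pairs.append((i, j))
    return np.array(feats), pairs

def build_dataset(n_systems, rng, n_nodes=5, n_steps=300):
    """Stack pair features and ground-truth edge labels over many systems."""
    X, y = [], []
    for _ in range(n_systems):
        x, mask = simulate_var1(n_nodes, n_steps, rng)
        feats, pairs = pair_features(x)
        X.append(feats)
        y.append(np.array([mask[i, j] for i, j in pairs], dtype=float))
    return np.vstack(X), np.concatenate(y)

def fit_logistic(X, y, lr=0.5, n_iter=500):
    """Logistic regression by batch gradient descent (bias as last weight)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w -= lr * Xb.T @ (p - y) / len(y)
    return w

def predict_proba(X, w):
    """Predicted probability that each candidate edge exists."""
    Xb = np.hstack([X, np.ones((len(X), 1))])
    return 1.0 / (1.0 + np.exp(-Xb @ w))

rng = np.random.default_rng(0)
X_train, y_train = build_dataset(40, rng)   # systems with known causal graphs
X_test, y_test = build_dataset(10, rng)     # held-out systems
w = fit_logistic(X_train, y_train)
p_test = predict_proba(X_test, w)           # edge scores for unseen systems
```

The key design point is that the edge detector is fitted on systems whose causal graph is known and then applied to unseen systems from the same domain, rather than relying on a fixed, hand-designed test. A real application would replace the toy simulator with labeled domain data and would likely use a richer learned model.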
