Differentiable Invariant Causal Discovery

Paper Title

Differentiable Invariant Causal Discovery

Authors

Yu Wang, An Zhang, Xiang Wang, Yancheng Yuan, Xiangnan He, Tat-Seng Chua

Abstract

Learning causal structure from observational data is a fundamental challenge in machine learning. However, most commonly used differentiable causal discovery methods are non-identifiable, which turns the problem into a continuous optimization task prone to data biases. In many real-life situations, data is collected from different environments in which the functional relations remain consistent, while the distribution of the additive noise may vary. This paper proposes Differentiable Invariant Causal Discovery (DICD), which uses multi-environment information within a differentiable framework to avoid learning spurious edges and wrong causal directions. Specifically, DICD aims to discover the environment-invariant causation while removing the environment-dependent correlation. We further formulate a constraint that enforces the target structural equation model to remain optimal across the environments. Theoretical guarantees for the identifiability of the proposed DICD are provided under mild conditions, given enough environments. Extensive experiments on synthetic and real-world datasets verify that DICD outperforms state-of-the-art causal discovery methods by up to 36% in structural Hamming distance (SHD). Our code will be open-sourced.
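
The abstract reports results in SHD without defining it. As a rough illustration only, the sketch below computes a structural Hamming distance between a true and an estimated directed graph given as adjacency matrices. The function name `shd` and the convention that a missing, extra, or reversed edge each counts as one error are assumptions made here; the paper may use a different variant.

```python
import numpy as np


def shd(true_adj, est_adj):
    """Structural Hamming distance between two directed graphs.

    Assumed convention (not taken from the paper): a missing, extra,
    or reversed edge each counts as exactly one error. Entry [i, j]
    being nonzero means there is an edge i -> j. Self-loops are ignored.
    """
    true_bin = (np.asarray(true_adj) != 0).astype(int)
    est_bin = (np.asarray(est_adj) != 0).astype(int)
    # Mark every ordered pair (i, j) where the two graphs disagree.
    diff = np.abs(true_bin - est_bin)
    # Fold (i, j) and (j, i) together so that a reversed edge, which
    # disagrees in both directions, lands on a single unordered pair.
    diff = diff + diff.T
    # Count each unordered pair with any disagreement once.
    return int(np.count_nonzero(np.triu(diff, k=1)))


# True graph: 0 -> 1, 1 -> 2
true_dag = np.array([[0, 1, 0],
                     [0, 0, 1],
                     [0, 0, 0]])
# Estimate: 1 -> 0 (reversed), 1 -> 2 (correct), 0 -> 2 (extra)
est_dag = np.array([[0, 0, 1],
                    [1, 0, 1],
                    [0, 0, 0]])

print(shd(true_dag, est_dag))  # → 2
```

A lower SHD means the estimated graph is structurally closer to the true one, so the "up to 36%" figure refers to a reduction in this kind of edge-error count.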
