Paper Title

Causal Inference in Geoscience and Remote Sensing from Observational Data

Authors

Adrián Pérez-Suay, Gustau Camps-Valls

Abstract

Establishing causal relations between random variables from observational data is perhaps the most important challenge in today's science. In remote sensing and the geosciences, this is of special relevance for better understanding the Earth system and the complex interactions between its governing processes. In this paper, we focus on observational causal inference: we try to estimate the correct direction of causation from a finite set of empirical data. In addition, we focus on the more complex bivariate scenario, which requires strong assumptions and in which no conditional independence tests can be used. In particular, we explore the framework of (non-deterministic) additive noise models, which relies on the principle of independence between the cause and the generating mechanism. A practical algorithmic instantiation of this principle requires only 1) two regression models, in the forward and backward directions, and 2) an estimate of the statistical independence between the obtained residuals and the observations. The direction leading to more independent residuals is taken to be the causal one. We instead propose a criterion that uses the sensitivity (derivative) of the dependence estimator. The sensitivity criterion makes it possible to identify the samples that most affect the dependence measure, and hence the criterion is robust to spurious detections. We illustrate performance on a collection of 28 geoscience causal inference problems, on a database of radiative transfer model simulations and machine learning emulators for vegetation parameter modeling involving 182 problems, and in assessing the impact of different regression models in a carbon cycle problem. The criterion achieves state-of-the-art detection rates in all cases and is generally robust to noise sources and distortions.
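
The baseline additive-noise-model recipe described in the abstract (two regressions plus an independence estimate on the residuals) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation, and it does not implement the proposed sensitivity criterion. The Nadaraya-Watson smoother, the Gaussian kernels with a median-heuristic bandwidth, the biased HSIC estimator, and the names `hsic`, `smoother_residuals`, and `anm_direction` are all choices made here for illustration.

```python
import math
import random


def median_bandwidth(v):
    """Median heuristic: median pairwise distance (1.0 if that median is 0)."""
    dists = sorted(abs(a - b) for i, a in enumerate(v) for b in v[i + 1:])
    return dists[len(dists) // 2] or 1.0


def gaussian_gram(v):
    """Gaussian-kernel Gram matrix of a 1-D sample (needs at least 2 points)."""
    s = median_bandwidth(v)
    return [[math.exp(-((a - b) ** 2) / (2.0 * s * s)) for b in v] for a in v]


def hsic(x, y):
    """Biased empirical HSIC, tr(K H L H) / n^2; larger means more dependent."""
    n = len(x)
    K, L = gaussian_gram(x), gaussian_gram(y)
    # Double-center K. K is symmetric, so row means equal column means.
    row = [sum(r) / n for r in K]
    tot = sum(row) / n
    return sum((K[i][j] - row[i] - row[j] + tot) * L[i][j]
               for i in range(n) for j in range(n)) / n ** 2


def smoother_residuals(x, y, rel_bandwidth=0.05):
    """Residuals y - f(x), where f is a Nadaraya-Watson kernel smoother.

    The bandwidth (a fraction of the range of x) is an assumed, untuned value.
    """
    h = rel_bandwidth * (max(x) - min(x)) or 1.0
    residuals = []
    for x0, y0 in zip(x, y):
        w = [math.exp(-((x0 - xi) ** 2) / (2.0 * h * h)) for xi in x]
        fitted = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
        residuals.append(y0 - fitted)
    return residuals


def anm_direction(x, y):
    """Pick the direction whose residuals look more independent of the input."""
    score_xy = hsic(x, smoother_residuals(x, y))  # hypothesis x -> y
    score_yx = hsic(y, smoother_residuals(y, x))  # hypothesis y -> x
    label = "x -> y" if score_xy < score_yx else "y -> x"
    return label, score_xy, score_yx


if __name__ == "__main__":
    # Synthetic toy pair (not data from the paper): nonlinear cause-effect
    # relation with additive uniform noise.
    rng = random.Random(0)
    x = [rng.uniform(-1.0, 1.0) for _ in range(200)]
    y = [xi ** 3 + rng.uniform(-0.1, 0.1) for xi in x]
    print(anm_direction(x, y))
```

The function returns the chosen direction together with both HSIC scores, so the margin between the two hypotheses can be inspected. In practice the choice of regression model and kernel bandwidth can change the decision, which is in line with the abstract's interest in the impact of different regression models.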
