Paper Title
Long-term Causal Inference Under Persistent Confounding via Data Combination
Paper Authors
Paper Abstract
We study the identification and estimation of long-term treatment effects when both experimental and observational data are available. Since the long-term outcome is observed only after a long delay, it is not measured in the experimental data, but only recorded in the observational data. However, both types of data include observations of some short-term outcomes. In this paper, we uniquely tackle the challenge of persistent unmeasured confounders, i.e., some unmeasured confounders that can simultaneously affect the treatment, short-term outcomes and the long-term outcome, noting that they invalidate identification strategies in previous literature. To address this challenge, we exploit the sequential structure of multiple short-term outcomes, and develop three novel identification strategies for the average long-term treatment effect. We further propose three corresponding estimators and prove their asymptotic consistency and asymptotic normality. We finally apply our methods to estimate the effect of a job training program on long-term employment using semi-synthetic data. We numerically show that our proposals outperform existing methods that fail to handle persistent confounders.