Paper Title
Combining Observational and Experimental Datasets Using Shrinkage Estimators
Paper Authors
Paper Abstract
We consider the problem of combining data from observational and experimental sources to draw causal conclusions. This problem is increasingly relevant, as the modern era has yielded passive collection of massive observational datasets in areas such as e-commerce and electronic health. These data may be used to supplement experimental data, which are frequently expensive to obtain. In Rosenman et al. (2018), we considered this problem under the assumption that all confounders were measured. Here, we relax the assumption of unconfoundedness. To derive combined estimators with desirable properties, we make use of results from the Stein shrinkage literature. Our contributions are threefold. First, we propose a generic procedure for deriving shrinkage estimators in this setting, making use of a generalized unbiased risk estimate. Second, we develop two new estimators, prove finite-sample conditions under which they have lower risk than an estimator using only experimental data, and show that each achieves a notion of asymptotic optimality. Third, we draw connections between our approach and results in sensitivity analysis, including proposing a method for evaluating the feasibility of our estimators.
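To make the idea of shrinkage concrete, the following is a minimal sketch of a classical positive-part James-Stein-style rule that pulls a vector of experimental estimates toward a vector of observational estimates. It is not one of the estimators developed in the paper. The function name `shrink_toward_observational`, its arguments, and the simplifying assumptions (a common known variance `var_exp` for every experimental stratum estimate, and observational estimates treated as a fixed target) are all hypothetical and chosen only for illustration.

```python
import numpy as np


def shrink_toward_observational(tau_exp, tau_obs, var_exp):
    """Shrink experimental estimates toward observational ones.

    Illustrative assumptions (not taken from the paper):
      - tau_exp holds K >= 3 stratum-level experimental estimates,
        each unbiased with the same known variance var_exp;
      - tau_obs holds the matching observational estimates, which may
        be biased and are treated as a fixed shrinkage target.
    Returns the combined estimate and the shrinkage weight in [0, 1].
    """
    tau_exp = np.asarray(tau_exp, dtype=float)
    tau_obs = np.asarray(tau_obs, dtype=float)
    k = tau_exp.size
    if k < 3:
        raise ValueError("James-Stein-style shrinkage needs at least 3 strata")

    diff = tau_exp - tau_obs
    dist2 = float(diff @ diff)
    if dist2 == 0.0:
        # The two sources agree exactly, so there is nothing to shrink.
        return tau_obs.copy(), 1.0

    # Classical James-Stein weight, capped at 1 (the positive-part rule).
    lam = min(1.0, (k - 2) * var_exp / dist2)
    return tau_exp - lam * diff, lam


if __name__ == "__main__":
    combined, lam = shrink_toward_observational(
        tau_exp=[1.0, 2.0, 3.0, 4.0],
        tau_obs=[0.0, 0.0, 0.0, 0.0],
        var_exp=1.5,
    )
    print(lam, combined)
```

When the two sources disagree strongly relative to the experimental noise, the weight is small and the result stays close to the experimental estimate. When they are close, the result moves toward the observational estimate.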