Paper Title

On the Pointwise Behavior of Recursive Partitioning and Its Implications for Heterogeneous Causal Effect Estimation

Authors

Matias D. Cattaneo, Jason M. Klusowski, Peter M. Tian

Abstract

Decision tree learning is increasingly being used for pointwise inference. Important applications include heterogeneous causal treatment effects and dynamic policy decisions, as well as conditional quantile regression and design of experiments, where tree estimation and inference are conducted at specific values of the covariates. In this paper, we call into question the use of decision trees (trained by adaptive recursive partitioning) for such purposes by demonstrating that they can fail to achieve polynomial rates of convergence in uniform norm with non-vanishing probability, even with pruning. Instead, the convergence may be arbitrarily slow or, in some important special cases, such as honest regression trees, fail completely. We show that random forests can remedy the situation, turning poorly performing trees into nearly optimal procedures, at the cost of losing interpretability and introducing two additional tuning parameters. The two hallmarks of random forests, subsampling and the random feature selection mechanism, are seen to each distinctively contribute to achieving nearly optimal performance for the model class considered.
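
As a rough illustration of how adaptive splitting can hurt pointwise estimates, the sketch below fits a single CART-style split (a stump) to pure noise and records how often the chosen split isolates a small cell near the edge of the sample. The setup is an assumption made for illustration and is not taken from the paper: one uniform covariate, a constant (zero) regression function, Gaussian noise, and the hypothetical helper names `best_split_index` and `edge_split_frequency`.

```python
import numpy as np


def best_split_index(x, y):
    """Return k such that the best squared-error stump puts the k smallest
    x-values in the left cell and the remaining n - k in the right cell."""
    order = np.argsort(x)
    y_sorted = np.asarray(y, dtype=float)[order]
    n = y_sorted.size
    k = np.arange(1, n)                    # candidate left-cell sizes 1..n-1
    left_sum = np.cumsum(y_sorted)[:-1]    # sum of the k smallest-x responses
    right_sum = y_sorted.sum() - left_sum
    # Minimising the within-cell sum of squared errors is equivalent to
    # maximising left_sum^2 / k + right_sum^2 / (n - k).
    score = left_sum**2 / k + right_sum**2 / (n - k)
    return int(k[np.argmax(score)])


def edge_split_frequency(n=500, reps=400, edge=0.1, seed=0):
    """Fraction of simulated samples in which the chosen split leaves at most
    a share `edge` of the observations in one of the two cells."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(reps):
        x = rng.uniform(size=n)
        y = rng.normal(size=n)             # assumed model: zero mean plus noise
        k = best_split_index(x, y)
        if k <= edge * n or k >= (1 - edge) * n:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("share of edge splits:", edge_split_frequency())
```

If split locations were spread evenly over the sample, roughly `2 * edge` of the runs would land in the edge region. In this assumed setup the simulated share tends to be noticeably higher, so the small cell's average rests on few observations and is noisy. This is only a one-split, one-covariate illustration and does not reproduce the paper's results on convergence rates, pruning, or random forests.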
