Paper title
On the Consistency of a Random Forest Algorithm in the Presence of Missing Entries
Paper authors
Paper abstract
This paper tackles the problem of constructing a non-parametric predictor when the latent variables are given with incomplete information. A convenient predictor for this task is the random forest algorithm in conjunction with the so-called CART criterion. The proposed technique enables a partial imputation of the missing values in the data set in a way that suits both a consistent estimator of the regression function and a partial recovery of the missing values. A proof of the consistency of the random forest estimator is given in the case where each latent variable is missing completely at random (MCAR).
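The two ingredients named in the abstract, MCAR missingness and the CART criterion, can be illustrated with a minimal sketch. It is not the paper's algorithm: the function names `mcar_mask`, `variance`, and `cart_criterion` are hypothetical, and evaluating a split only on rows where the split variable is observed is an assumption made here for illustration, since the abstract does not specify how missing entries enter the criterion.

```python
import random


def mcar_mask(rows, p_missing, rng):
    """Return a copy of rows in which each entry is independently replaced
    by None with probability p_missing (missing completely at random)."""
    return [[None if rng.random() < p_missing else v for v in row] for row in rows]


def variance(ys):
    """Empirical (biased) variance of a list of responses; 0.0 if empty."""
    if not ys:
        return 0.0
    mean = sum(ys) / len(ys)
    return sum((y - mean) ** 2 for y in ys) / len(ys)


def cart_criterion(rows, ys, feature, threshold):
    """Variance reduction of the split `x[feature] < threshold`.

    Assumption for this sketch: rows whose split variable is missing (None)
    are simply left out of the computation.
    """
    observed = [(r[feature], y) for r, y in zip(rows, ys) if r[feature] is not None]
    if not observed:
        return 0.0
    n = len(observed)
    all_y = [y for _, y in observed]
    left = [y for x, y in observed if x < threshold]
    right = [y for x, y in observed if x >= threshold]
    return (
        variance(all_y)
        - len(left) / n * variance(left)
        - len(right) / n * variance(right)
    )


# Toy data: one covariate, response jumps from 0 to 1 at x = 0.5.
rows = [[0.1], [0.2], [0.8], [0.9]]
ys = [0.0, 0.0, 1.0, 1.0]
masked = mcar_mask(rows, p_missing=0.3, rng=random.Random(0))
score_complete = cart_criterion(rows, ys, feature=0, threshold=0.5)
score_masked = cart_criterion(masked, ys, feature=0, threshold=0.5)
```

On the complete toy data the split at 0.5 separates the two response levels exactly, so `score_complete` equals the full variance of the responses. `score_masked` depends on which entries the mask happens to remove.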