Paper title
Random boosting and random^2 forests -- A random tree depth injection approach
Paper authors
Paper abstract
Inducing additional randomness in parallel and sequential ensemble methods has proven worthwhile in many respects. In this manuscript, we propose and examine a novel random tree depth injection approach suitable for sequential and parallel tree-based methods, including Boosting and Random Forests. The resulting methods are called \emph{Random Boost} and \emph{Random$^2$ Forest}. Both approaches serve as valuable extensions to the existing literature on the gradient boosting framework and random forests. A Monte Carlo simulation, in which tree-shaped data sets with different numbers of final partitions are built, suggests that there are several scenarios where \emph{Random Boost} and \emph{Random$^2$ Forest} can improve the prediction performance of conventional hierarchical boosting and random forest approaches. The new algorithms appear to be especially successful when the generated data contain only a few high-order interactions. In addition, our simulations suggest that the random tree depth injection approach can reduce computation time by up to 40%, while the losses in prediction accuracy turn out to be minor or even negligible in most cases.
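The abstract does not spell out the algorithm, so the following is only a minimal sketch of what a random tree depth injection could look like in a boosting loop. It assumes squared-error loss and assumes that each tree's depth is drawn uniformly from 1 to an upper bound; the names `fit_tree`, `random_boost`, `d_max` and `learning_rate` are illustrative and not taken from the paper.

```python
import random


def _sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values)


def fit_tree(X, y, depth):
    """Greedy least-squares regression tree with at most `depth` split levels.

    Returns either a float (leaf value) or a tuple
    (feature_index, threshold, left_subtree, right_subtree).
    """
    mean = sum(y) / len(y)
    if depth == 0 or len(y) < 2:
        return mean
    best = None  # (score, feature_index, threshold)
    for j in range(len(X[0])):
        # Dropping the largest value keeps both children non-empty.
        for t in sorted({row[j] for row in X})[:-1]:
            left = [yi for row, yi in zip(X, y) if row[j] <= t]
            right = [yi for row, yi in zip(X, y) if row[j] > t]
            score = _sse(left) + _sse(right)
            if best is None or score < best[0]:
                best = (score, j, t)
    if best is None:  # every feature is constant: no split possible
        return mean
    _, j, t = best
    lo = [(row, yi) for row, yi in zip(X, y) if row[j] <= t]
    hi = [(row, yi) for row, yi in zip(X, y) if row[j] > t]
    return (j, t,
            fit_tree([r for r, _ in lo], [v for _, v in lo], depth - 1),
            fit_tree([r for r, _ in hi], [v for _, v in hi], depth - 1))


def predict_tree(tree, row):
    """Walk down the nested tuples until a leaf value is reached."""
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree


def random_boost(X, y, n_trees=50, d_max=4, learning_rate=0.1, seed=0):
    """Boosting on residuals where each tree gets a randomly drawn depth.

    Assumption: the depth is uniform on {1, ..., d_max}.
    """
    rng = random.Random(seed)
    base = sum(y) / len(y)
    pred = [base] * len(y)
    trees, depths = [], []
    for _ in range(n_trees):
        depth = rng.randint(1, d_max)  # the random depth injection step
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        tree = fit_tree(X, residuals, depth)
        pred = [pi + learning_rate * predict_tree(tree, row)
                for pi, row in zip(pred, X)]
        trees.append(tree)
        depths.append(depth)
    return {"base": base, "trees": trees, "depths": depths,
            "learning_rate": learning_rate}


def predict(model, row):
    """Prediction of the boosted ensemble for a single row."""
    total = sum(predict_tree(tree, row) for tree in model["trees"])
    return model["base"] + model["learning_rate"] * total
```

Calling `random_boost(X, y, d_max=4)` on a list of feature rows `X` and targets `y` returns the fitted trees together with the depth drawn for each one. Under the same assumption, a parallel variant in the spirit of \emph{Random$^2$ Forest} could draw a depth in the same way for every bootstrap tree of a random forest. Because the average drawn depth lies below `d_max`, the trees are cheaper to fit on average than trees of fixed depth `d_max`.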