Paper Title

SnapBoost: A Heterogeneous Boosting Machine

Paper Authors

Parnell, Thomas, Anghel, Andreea, Lazuka, Malgorzata, Ioannou, Nikolas, Kurella, Sebastian, Agarwal, Peshal, Papandreou, Nikolaos, Pozidis, Haralampos

Paper Abstract

Modern gradient boosting software frameworks, such as XGBoost and LightGBM, implement Newton descent in a functional space. At each boosting iteration, their goal is to find the base hypothesis, selected from some base hypothesis class, that is closest to the Newton descent direction in a Euclidean sense. Typically, the base hypothesis class is fixed to be all binary decision trees up to a given depth. In this work, we study a Heterogeneous Newton Boosting Machine (HNBM) in which the base hypothesis class may vary across boosting iterations. Specifically, at each boosting iteration, the base hypothesis class is chosen, from a fixed set of subclasses, by sampling from a probability distribution. We derive a global linear convergence rate for the HNBM under certain assumptions, and show that it agrees with existing rates for Newton's method when the Newton direction can be perfectly fitted by the base hypothesis at each boosting iteration. We then describe a particular realization of a HNBM, SnapBoost, that, at each boosting iteration, randomly selects between either a decision tree of variable depth or a linear regressor with random Fourier features. We describe how SnapBoost is implemented, with a focus on the training complexity. Finally, we present experimental results, using OpenML and Kaggle datasets, that show that SnapBoost is able to achieve better generalization loss than competing boosting frameworks, without taking significantly longer to tune.
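The abstract describes the core loop of a Heterogeneous Newton Boosting Machine: at each boosting iteration, a base hypothesis class is sampled from a fixed set of subclasses (in SnapBoost, either a decision tree of variable depth or a linear regressor with random Fourier features), and the chosen learner is fit to the Newton descent direction. The following is a minimal sketch of that idea for the squared-loss case, built from scikit-learn components; the class name HeterogeneousNewtonBooster and parameters such as p_tree and tree_depths are illustrative assumptions, not SnapBoost's actual implementation or API.

```python
# Minimal sketch of a Heterogeneous Newton Boosting Machine for squared loss.
# Names and hyperparameters are illustrative, not SnapBoost's actual API.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline


class HeterogeneousNewtonBooster:
    def __init__(self, n_rounds=100, learning_rate=0.1,
                 tree_depths=(1, 2, 3, 4, 5), p_tree=0.8, seed=0):
        self.n_rounds = n_rounds
        self.learning_rate = learning_rate
        self.tree_depths = tree_depths
        self.p_tree = p_tree  # probability of sampling the tree subclass
        self.rng = np.random.default_rng(seed)
        self.ensemble = []

    def _sample_base_learner(self):
        # At each iteration, sample the base hypothesis class: either a
        # decision tree of random depth, or a linear regressor fit on
        # random Fourier features.
        if self.rng.random() < self.p_tree:
            depth = int(self.rng.choice(self.tree_depths))
            return DecisionTreeRegressor(max_depth=depth)
        rff_seed = int(self.rng.integers(1 << 31))
        return make_pipeline(RBFSampler(n_components=100, random_state=rff_seed),
                             Ridge(alpha=1.0))

    def fit(self, X, y):
        pred = np.zeros(len(y))
        for _ in range(self.n_rounds):
            # For squared loss the gradient is (pred - y) and the Hessian is 1,
            # so the Newton direction is simply the residual y - pred.
            residual = y - pred
            learner = self._sample_base_learner()
            learner.fit(X, residual)
            self.ensemble.append(learner)
            pred += self.learning_rate * learner.predict(X)
        return self

    def predict(self, X):
        pred = np.zeros(X.shape[0])
        for learner in self.ensemble:
            pred += self.learning_rate * learner.predict(X)
        return pred
```

For general losses, the residual step above would be replaced by fitting the learner to the per-example Newton direction (negative gradient divided by Hessian), weighted by the Hessian, which is how Newton boosting frameworks such as XGBoost and LightGBM operate.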
