Paper Title
Context-Aware Ensemble Learning for Time Series
Paper Authors
Paper Abstract
We investigate ensemble methods for prediction in an online setting. Unlike the existing literature on ensembling, we introduce, for the first time, an approach in which a meta learner effectively combines the base model predictions using a superset of features, i.e., the union of the base models' feature vectors, rather than the predictions themselves. Hence, our model does not feed the predictions of the base models as inputs to a machine learning algorithm, but instead chooses the best possible combination at each time step based on the state of the problem. We explore three different constraint spaces for the ensembling weights that linearly combine the base predictions: convex combinations, where the components of the weight vector are all nonnegative and sum to 1; affine combinations, where the weight vector components are only required to sum to 1; and unconstrained combinations, where the components are free to take any real value. The constraints are both analyzed theoretically under known statistics and integrated into the learning procedure of the meta learner as part of the optimization in an automated manner. To show the practical efficiency of the proposed method, we employ a gradient-boosted decision tree and a multi-layer perceptron, separately, as the meta learners. Our framework is generic, so other machine learning architectures can be used as the ensembler as long as they allow for minimizing a custom differentiable loss. We demonstrate the learning behavior of our algorithm on synthetic data and significant performance improvements over conventional methods on various real-life datasets that have been extensively used in well-known data competitions. Furthermore, we openly share the source code of the proposed method to facilitate further research and comparison.
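To make the three constraint spaces from the abstract concrete, the following restates them in symbols; the notation ($w$, $m$, $\hat{y}$) is ours, introduced here for illustration and not taken from the paper:

```latex
% Ensemble prediction as a linear combination of m base predictions:
%   \hat{y}_t = \sum_{i=1}^{m} w_{t,i}\, \hat{y}_{t,i} = w_t^\top \hat{y}_t
\begin{align*}
  \mathcal{W}_{\text{convex}} &= \Big\{ w \in \mathbb{R}^m : w_i \ge 0,\ \textstyle\sum_{i=1}^m w_i = 1 \Big\}, \\
  \mathcal{W}_{\text{affine}} &= \Big\{ w \in \mathbb{R}^m : \textstyle\sum_{i=1}^m w_i = 1 \Big\}, \\
  \mathcal{W}_{\text{unconstrained}} &= \mathbb{R}^m,
\end{align*}
% so that
% \mathcal{W}_{\text{convex}} \subset \mathcal{W}_{\text{affine}} \subset \mathcal{W}_{\text{unconstrained}}.
```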
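Below is a minimal Python sketch, not the authors' implementation, of the combination step the abstract describes: a meta learner maps the union of the base models' feature vectors to a raw weight vector, which is then mapped into one of the three constraint spaces before linearly combining the base predictions. The linear stand-in meta learner and all variable names are illustrative assumptions; the paper instead trains a gradient-boosted decision tree or a multi-layer perceptron with a custom differentiable loss.

```python
import numpy as np

rng = np.random.default_rng(0)
m, d = 3, 8  # m base models, d = dimension of the feature-vector union

def to_convex(z):
    """Softmax: nonnegative weights summing to 1 (convex combination)."""
    e = np.exp(z - z.max())
    return e / e.sum()

def to_affine(z):
    """Shift so the weights sum to 1, signs unrestricted (affine combination)."""
    return z + (1.0 - z.sum()) / z.size

def to_unconstrained(z):
    """Weights free to take any real value (unconstrained combination)."""
    return z

# Stand-in meta learner: a linear map from the feature union to raw weights.
W = rng.standard_normal((m, d))

x_t = rng.standard_normal(d)             # union of base features at time t
base_preds = np.array([1.2, 0.7, 2.5])   # base model predictions at time t

z = W @ x_t
for project in (to_convex, to_affine, to_unconstrained):
    w = project(z)
    y_hat = w @ base_preds               # linear combination of base predictions
    print(f"{project.__name__}: weights={np.round(w, 3)}, prediction={y_hat:.3f}")
```

Note that the meta learner never sees the base predictions as inputs; it sees only the feature union `x_t`, which is the distinguishing design choice highlighted in the abstract.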