Hybrid Tree-based Models for Insurance Claims

Paper Title

Hybrid Tree-based Models for Insurance Claims

Authors

Zhiyu Quan, Zhiguo Wang, Guojun Gan, Emiliano A. Valdez

Abstract

Two-part models and Tweedie generalized linear models (GLMs) have been used to model loss costs for short-term insurance contracts. Most portfolios of insurance claims contain a large proportion of zero claims, and the resulting imbalance degrades the prediction accuracy of these traditional approaches. This article proposes tree-based models with a hybrid structure, built by a two-step algorithm, as an alternative to these traditional models. The first step constructs a classification tree to build the probability model for frequency. The second step fits elastic net regression models at each terminal node of the classification tree to build the distribution model for severity. This hybrid structure captures the benefits of tuning hyperparameters at each step of the algorithm: tuning improves prediction accuracy and can be directed at specific business objectives. We examine and compare the predictive performance of this hybrid tree-based structure against the traditional Tweedie model using both real and synthetic datasets. Our empirical results show that these hybrid tree-based models produce more accurate predictions without losing intuitive interpretation.
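
The two-step structure described in the abstract can be written compactly. The following is only a sketch under stated assumptions, not the formulation used in the paper: the notation ($m(x)$, $\hat{p}_m$, $\hat{s}_m$, $\lambda_m$, $\alpha_m$) is introduced here for illustration, the node probability is taken to be the share of policies with a positive claim, and the severity model is assumed to be a linear elastic net with squared-error loss. The paper may use a different severity distribution or link function.

```latex
% Assumed notation (not from the source):
%   m(x)      terminal node of the classification tree reached by covariates x
%   \hat{p}_m estimated probability of a positive claim in node m
%   \hat{s}_m severity model fitted on the positive claims in node m
%   n_m       number of positive claims in node m

% Predicted loss cost: frequency part times severity part
\hat{\mu}(x) = \hat{p}_{m(x)} \, \hat{s}_{m(x)}(x),
\qquad
\hat{s}_m(x) = \hat{\beta}_{0,m} + x^{\top} \hat{\beta}_m

% Elastic net fit within node m (squared-error loss is an assumption)
(\hat{\beta}_{0,m}, \hat{\beta}_m)
  = \arg\min_{\beta_0,\, \beta}
    \frac{1}{2 n_m} \sum_{i \in m,\; y_i > 0}
      \left( y_i - \beta_0 - x_i^{\top} \beta \right)^2
    + \lambda_m \left[ \alpha_m \lVert \beta \rVert_1
      + \frac{1 - \alpha_m}{2} \lVert \beta \rVert_2^2 \right]
```

Under this reading, the hyperparameters tuned in step one are those of the tree (for example its depth or pruning level), and those tuned in step two are the penalty strength $\lambda_m$ and the mixing weight $\alpha_m$, which is one way to interpret the abstract's remark that tuning happens at each step of the algorithm.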
