Paper Title

Healthcare Cost Prediction: Leveraging Fine-grain Temporal Patterns

Authors

Morid, Mohammad Amin; Sheng, Olivia R. Liu; Kawamoto, Kensaku; Ault, Travis; Dorius, Josette; Abdelrahman, Samir

Abstract

Objective: To design and assess a method to leverage individuals' temporal data for predicting their healthcare cost. To achieve this goal, we first used patients' temporal data in their fine-grain form as opposed to coarse-grain form. Second, we devised novel spike detection features to extract temporal patterns that improve the performance of cost prediction. Third, we evaluated the effectiveness of different types of temporal features based on cost information, visit information and medical information for the prediction task.

Materials and methods: We used three years of medical and pharmacy claims data from 2013 to 2016 from a healthcare insurer, where the first two years were used to build the model to predict the costs in the third year. To prepare the data for modeling and prediction, the time series data of cost, visit and medical information were extracted in the form of fine-grain features (i.e., segmenting each time series into a sequence of consecutive windows and representing each window by various statistics such as sum). Then, temporal patterns of the time series were extracted and added to fine-grain features using a novel set of spike detection features (i.e., the fluctuation of data points). Gradient Boosting was applied on the final set of extracted features. Moreover, the contribution of each type of data (i.e., cost, visit and medical) was assessed.

Conclusions: Leveraging fine-grain temporal patterns for healthcare cost prediction significantly improves prediction performance. Enhancing fine-grain features with extraction of temporal cost and visit patterns significantly improved the performance. However, medical features did not have a significant effect on prediction performance. Gradient Boosting outperformed all other prediction models.
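To make the feature construction described in the abstract more concrete, the following is a minimal Python sketch, not the authors' implementation. It segments a per-patient cost series into consecutive windows summarized by their sum (the fine-grain features) and adds a simple spike summary. The abstract only describes spikes as "the fluctuation of data points", so the threshold rule used here (mean plus `k` standard deviations), the function names `window_features` and `spike_features`, and the sample `monthly_cost` values are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def window_features(series, window):
    """Split a series into consecutive, non-overlapping windows
    and represent each window by its sum (one fine-grain feature per window)."""
    return [sum(series[i:i + window]) for i in range(0, len(series), window)]


def spike_features(series, k=1.0):
    """Hypothetical spike summary (an assumption, not the paper's definition):
    a point counts as a spike if it exceeds mean + k * standard deviation."""
    mu = mean(series)
    threshold = mu + k * pstdev(series)
    spikes = [x for x in series if x > threshold]
    return {
        "spike_count": len(spikes),
        # Height of the tallest spike above the series mean; 0.0 if no spikes.
        "max_spike_height": (max(spikes) - mu) if spikes else 0.0,
    }


# Made-up monthly costs for one patient over one year (illustrative only).
monthly_cost = [100, 120, 90, 110, 900, 100, 95, 105, 115, 100, 850, 110]

coarse_grain = [sum(monthly_cost)]                   # one yearly total
fine_grain = window_features(monthly_cost, window=3)  # four quarterly sums
spikes = spike_features(monthly_cost)

# Final feature vector: fine-grain window sums plus spike summaries.
feature_vector = fine_grain + [spikes["spike_count"], spikes["max_spike_height"]]
print(feature_vector)
```

In this sketch the coarse-grain representation collapses the year into a single total, while the fine-grain windows and spike summary preserve when and how sharply costs changed. A vector like `feature_vector`, built per patient for cost, visit and medical series, is the kind of input a gradient boosting regressor would then be trained on to predict the following year's cost.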
