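Research on Predicting Hospitalization Costs for Patients with Cardiovascular Disease Based on Machine Learning Models: From the Perspective of the "Disease Carrier" Commercial Health Insurance Strategy

YAO Minxin (姚敏欣)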

China Health Insurance (中国医疗保险), 2024, Vol. 0, Issue (6): 105-118. DOI: 10.19546/j.issn.1674-3830.2024.6.015
Section: 商保天地 (Commercial Insurance)

Abstract

Objective: Insurance coverage for people with pre-existing conditions is drawing growing attention. Taking cardiovascular disease as an example, this study predicts patients' hospitalization costs and analyzes the factors that influence them, to inform commercial health insurance strategies for people with pre-existing conditions. Methods: Six mainstream machine learning algorithms were evaluated, and the hyperparameters of the selected model were tuned by Bayesian optimization to improve the accuracy of hospitalization cost prediction for patients with cardiovascular disease. The study also analyzed how customer characteristics, disease characteristics, and geographical attributes relate to hospitalization costs. Results: The gradient boosting decision tree-Gaussian process (GBDF-GP) model performed best in predicting hospitalization costs for patients with cardiovascular disease. Customer attributes were the main influencing factor, followed by disease attributes; geographical factors had a significant effect only among patients aged 91 to 100. Insurers are advised to apply the GBDF-GP model to broaden underwriting coverage and to carry out fine-grained risk assessment for different customer groups. The findings also provide an empirical basis for innovation in insurance products and for diversification of the insurance market.
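The abstract does not describe how the Bayesian optimization step was implemented. As a rough illustration only, the sketch below tunes a single hyperparameter with a Gaussian process surrogate and the expected improvement criterion, using only the Python standard library. The function `toy_cv_error`, the kernel length scale, the search range, and all other names are assumptions made for this example; in practice the objective would be the cross-validated prediction error of the gradient boosting model rather than a toy quadratic.

```python
import math

LENGTH = 0.2   # assumed kernel length scale (illustrative)
JITTER = 1e-6  # small diagonal term for numerical stability

def rbf(a, b):
    """Squared-exponential kernel between two scalars."""
    return math.exp(-0.5 * ((a - b) / LENGTH) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(xs, ys):
    """Factorise the kernel matrix and precompute weights for the posterior mean."""
    mean_y = sum(ys) / len(ys)
    n = len(xs)
    K = [[rbf(xs[i], xs[j]) + (JITTER if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, [y - mean_y for y in ys]))
    return L, alpha, mean_y

def gp_predict(xs, model, x_new):
    """Posterior mean and standard deviation of the surrogate at x_new."""
    L, alpha, mean_y = model
    k_star = [rbf(x, x_new) for x in xs]
    mu = mean_y + sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    var = max(1.0 - sum(t * t for t in v), 1e-12)
    return mu, math.sqrt(var)

def expected_improvement(mu, sigma, best):
    """Expected improvement over the current best value (minimisation)."""
    z = (best - mu) / sigma
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return (best - mu) * cdf + sigma * pdf

def bayes_opt(objective, lo, hi, n_iter=12, grid=101):
    """Minimise objective on [lo, hi] by GP-based Bayesian optimization."""
    xs = [lo, (lo + hi) / 2.0, hi]          # three fixed starting points
    ys = [objective(x) for x in xs]
    candidates = [lo + (hi - lo) * i / (grid - 1) for i in range(grid)]
    for _ in range(n_iter):
        model = gp_fit(xs, ys)
        best = min(ys)
        scored = [(expected_improvement(*gp_predict(xs, model, c), best), c)
                  for c in candidates if c not in xs]
        _, x_next = max(scored)             # candidate with the highest EI
        xs.append(x_next)
        ys.append(objective(x_next))
    i = min(range(len(ys)), key=ys.__getitem__)
    return xs[i], ys[i]

def toy_cv_error(learning_rate):
    """Hypothetical stand-in for a cross-validated error curve."""
    return (learning_rate - 0.3) ** 2

best_x, best_y = bayes_opt(toy_cv_error, 0.0, 1.0)
print(best_x, best_y)
```

The surrogate is refitted after every evaluation, so each new trial is placed where the model expects the largest gain over the best error seen so far. This is why Bayesian optimization is attractive when each evaluation means retraining a model.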

Key words

cardiovascular disease / hospitalization expenses / machine learning / insurance for people with pre-existing conditions / commercial health insurance

Cite this article

YAO Minxin. Research on Predicting Hospitalization Costs for Patients with Cardiovascular Disease Based on Machine Learning Models: From the Perspective of the "Disease Carrier" Commercial Health Insurance Strategy[J]. China Health Insurance. 2024, 0(6): 105-118 https://doi.org/10.19546/j.issn.1674-3830.2024.6.015
CLC number: F840.684    C913.7


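Funding

National Natural Science Foundation of China, "Research on Risk Measurement and Control Strategies for International Stock Index Volatility from the Perspective of Spatial Dynamic Time-Varying Correlation" (72161001)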
