A gradient boosting tree model for multi-department venous thromboembolism risk assessment with imbalanced data

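Computer science; Machine learning; Decision tree; Gradient boosting; Artificial intelligence; Population; Tree (set theory); Task (project management); Data mining; Risk assessment; Boosting (machine learning); Medicine; Random forest; Environmental health; Mathematical analysis; Computer security; Economics; Management; Mathematics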

Authors

Handong Ma, Zhecheng Dong, Mingcheng Chen, Wenbo Sheng, Yao Li, Weinan Zhang, Shaodian Zhang, Yong Yu

Source

Journal: Journal of Biomedical Informatics [Elsevier BV]
Date: 2022-09-16 Volume: 134, article 104210 Citations: 4

Links

nih.gov, doi.org

Identifier

DOI: 10.1016/j.jbi.2022.104210

Abstract

Venous thromboembolism (VTE) is the world's third most common cause of vascular mortality and a serious complication that arises across multiple hospital departments. VTE risk assessment guides timely clinical intervention and is of great importance to in-hospital patients. Traditional VTE risk assessment methods based on scoring scales require rules carefully designed by human experts and are difficult to apply to large populations, since manually designed rules are not guaranteed to be accurate for all populations. In contrast, with the development of electronic health record (EHR) datasets, data-driven, machine-learning-based risk assessment methods have shown superior predictive performance in many recent studies. This paper uses the gradient boosting tree model to study the VTE risk assessment problem with multi-department data. VTE data collected at the level of the entire hospital have two distinct characteristics: wide distribution and heterogeneity across multiple departments. To this end, we treat prediction over multiple departments as a multi-task learning process and introduce TSGB, a task-aware tree-based method, to tackle the multi-task prediction problem. Although multi-task learning improves overall across-department performance, we reveal a problem of task-wise performance decline when the volume of VTE data is imbalanced across departments. Based on this analysis, we propose two variants of TSGB to alleviate the problem and further boost prediction performance. Compared with state-of-the-art rule-based and multi-task tree-based methods, the experimental results show that the proposed methods not only improve the overall across-department AUC effectively, but also ensure improved performance for every single department.
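
The distinction the abstract draws between overall across-department AUC and per-department AUC can be illustrated with a small sketch. This is not the paper's TSGB method or its evaluation code; the department names, labels, and scores below are hypothetical, and the helper names `auc` and `auc_by_department` are assumptions introduced only for illustration. It shows how a large department can dominate the pooled AUC while a small department still performs poorly.

```python
from collections import defaultdict


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. Returns None when one class is
    missing, because AUC is undefined in that case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return None
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def auc_by_department(records):
    """Group (department, label, score) records and compute AUC per group."""
    groups = defaultdict(lambda: ([], []))
    for dept, label, score in records:
        groups[dept][0].append(label)
        groups[dept][1].append(score)
    return {dept: auc(ys, ss) for dept, (ys, ss) in groups.items()}


# Hypothetical, deliberately imbalanced data: one large and one small department.
records = [
    ("orthopedics", 1, 0.9),
    ("orthopedics", 1, 0.8),
    ("orthopedics", 0, 0.4),
    ("orthopedics", 0, 0.3),
    ("orthopedics", 0, 0.2),
    ("orthopedics", 0, 0.1),
    ("oncology", 1, 0.35),
    ("oncology", 0, 0.5),
]

overall = auc([y for _, y, _ in records], [s for _, _, s in records])
per_dept = auc_by_department(records)

print("overall AUC:", overall)
for dept, value in per_dept.items():
    print(dept, "AUC:", value)
```

In this toy data the pooled AUC stays high because the larger department is ranked perfectly, while the smaller department's single positive case is ranked below its negative case. Reporting both views is one simple way to check for the task-wise decline the abstract describes.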
