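Computer science
Machine learning
Decision tree
Gradient boosting
Artificial intelligence
Population
Tree (set theory)
Task (project management)
Data mining
Risk assessment
Boosting (machine learning)
Medicine
Random forest
Environmental health
Mathematical analysis
Computer security
Economics
Management
Mathematics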
                
                        
                    
Authors

Handong Ma, Zhecheng Dong, Mingcheng Chen, Wenbo Sheng, Yao Li, Weinan Zhang, Shaodian Zhang, Yong Yu
         
                    
        
    
            
Identifier

DOI: 10.1016/j.jbi.2022.104210
                                    
                                
                                 
         
        
                
Abstract

Venous thromboembolism (VTE) is the world's third most common cause of vascular mortality and a serious complication seen across multiple hospital departments. Timely VTE risk assessment guides clinical intervention and is of great importance for in-hospital patients. Traditional VTE risk assessment methods rely on risk assessment scales, which require rules carefully designed by human experts. They are difficult to apply to large-population scenarios because manually designed rules are not guaranteed to be accurate for all populations. In contrast, with the development of electronic health record (EHR) datasets, data-driven, machine-learning-based risk assessment methods have shown superior predictive performance in many recent studies. This paper uses a gradient boosting tree model to study the VTE risk assessment problem with multi-department data. VTE data collected at the level of the entire hospital has two distinct characteristics: it is widely distributed, and it is heterogeneous across departments. To address this, we treat prediction over multiple departments as a multi-task learning process and introduce TSGB, a task-aware tree-based method, to tackle the multi-task prediction problem. Although multi-task learning improves overall across-department performance, we reveal a problem of task-wise performance decline when VTE data volume is imbalanced across departments. Based on this analysis, we propose two variants of TSGB that alleviate the problem and further boost prediction performance. Compared with state-of-the-art rule-based and multi-task tree-based methods, the experimental results show that the proposed methods not only improve overall across-department AUC effectively, but also improve prediction performance for every individual department.
         
            
 
                 
                
                    