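Dementia
Gradient boosting
Boosting (machine learning)
Computer science
Artificial intelligence
Variable (mathematics)
Machine learning
Statistics
Mathematics
Random forest
Medicine
Mathematical analysis
Pathology
Disease
Authors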
Seong-Eun Ryu,Dong-Hoon Shin,Kyungyong Chung
Source
Journal: IEEE Access
[Institute of Electrical and Electronics Engineers]
Date: 2020-01-01
Volume: 8, pages 177708-177720
Citations: 34
Identifier
DOI:10.1109/access.2020.3025553
Abstract
As healthcare technologies have advanced, the elderly population has grown, and population ageing has emerged as a social issue. It has led to a rise in patients with geriatric disorders, among which dementia is especially damaging to the elderly's activities of daily living. Previous studies on dementia risk prediction proposed a deep learning method, which requires a large amount of image data and a long training time. This study therefore proposes a dementia risk prediction model based on XGBoost that uses derived variable extraction from numerical dementia data together with hyper-parameter optimization. The proposed method uses gradient boosting to extract variable importance from typical independent variables and then generates derived variables. Variable importance analysis is applied to the generated derived variables to create Top-N groups. Hyper-parameter tuning is then conducted for each Top-N group to achieve optimal performance in line with that group's data characteristics. The performance of the XGBoost model is evaluated on the optimized groups. In addition, to evaluate the proposed model, the goodness-of-fit of machine learning classification models is evaluated. In the Top-N group evaluation with different numbers of derived variables, the Top-20 model showed the best performance, with optimized hyper-parameter values of eta = 0.10, gamma = 0, max_depth = 4, and min_child_weight = 1. The proposed XGBoost model achieved an accuracy of 85.61% and an F1-score of 79.28%, the best performance among the compared Decision Tree, Random Forest, SVM, and k-NN models.
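The two selection steps the abstract describes, ranking variables by importance to form a Top-N group and then searching hyper-parameter combinations for that group, can be sketched in plain Python. This is a minimal illustration and not the authors' implementation: the feature names, importance scores, and candidate grid values are hypothetical, and `toy_score` is a stand-in for a cross-validated XGBoost evaluation so that the sketch runs without a dataset. Only the parameter names (eta, gamma, max_depth, min_child_weight) and the optimum used as the placeholder target come from the abstract.

```python
from itertools import product


def top_n_features(importance, n):
    """Return the n feature names with the highest importance scores."""
    ranked = sorted(importance.items(), key=lambda kv: kv[1], reverse=True)
    return [name for name, _ in ranked[:n]]


def grid_search(param_grid, score_fn):
    """Score every parameter combination; return (best_params, best_score)."""
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[k] for k in names)):
        params = dict(zip(names, values))
        score = score_fn(params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical importance scores for original and derived variables.
importance = {"age": 0.30, "mmse": 0.45, "age_x_mmse": 0.15, "bmi": 0.10}
selected = top_n_features(importance, 2)

# Candidate values are assumptions; the abstract reports only the optimum.
param_grid = {
    "eta": [0.05, 0.10, 0.30],
    "gamma": [0, 1],
    "max_depth": [3, 4, 6],
    "min_child_weight": [1, 3],
}


def toy_score(params):
    """Placeholder score (not a real result): closeness to the optimum
    reported in the abstract. Replace with cross-validated accuracy."""
    target = {"eta": 0.10, "gamma": 0, "max_depth": 4, "min_child_weight": 1}
    return -sum(abs(params[k] - target[k]) for k in target)


best_params, best_score = grid_search(param_grid, toy_score)
print(selected)
print(best_params)
```

In a real pipeline, `score_fn` would train a model on the selected Top-N columns and return a validation metric such as accuracy or F1-score, and the search would be repeated for each Top-N group.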