Authors
Jie Zhou,Hao Wu,Linge Zhang,Qiaona Zhang,Jie Wang,Hang Zhao,Yonghui Dang,Shiyu Zhang,Lu Li
Abstract
The assessment of dialysis adequacy is of great clinical importance. However, it depends on the nonlinear effects of numerous confounding factors and is therefore difficult to predict using traditional statistical methods. In this study, we used Random Forest (RF), Extreme Gradient Boosting (XGBoost), and Least Absolute Shrinkage and Selection Operator (LASSO) regression to assess dialysis adequacy. A training set (70%) and a test set (30%) were randomly selected from the 264 dialysis patient case records collected for this study. We compared the machine learning models with a statistical logistic regression prediction model. In addition, we performed fivefold internal cross-validation and external validation. The test dataset showed sensitivity values of 0.86 (95% CI = 0.75-0.96), 0.81 (95% CI = 0.69-0.93), and 0.72 (95% CI = 0.58-0.85) for the RF, XGBoost, and LASSO models, respectively. The matched specificity was 0.73 (95% CI = 0.58-0.87), 0.81 (95% CI = 0.67-0.93), and 0.83 (95% CI = 0.71-0.95). Accuracy was 0.80 (95% CI = 0.71-0.89), 0.81 (95% CI = 0.72-0.90), and 0.77 (95% CI = 0.68-0.86). F1 scores were 0.83 (95% CI = 0.72-0.90), 0.82 (95% CI = 0.73-0.91), and 0.78 (95% CI = 0.67-0.87). The areas under the receiver operating characteristic curve (AUROC) were 0.88 (p < 0.05, 95% CI = 0.70-0.88), 0.86 (p = 0.12, 95% CI = 0.72-0.90), and 0.88 (p < 0.05, 95% CI = 0.69-0.86). The mean absolute errors (MAE) of the calibration curves were 0.15, 0.11, and 0.07. In addition, the decision curve analysis (DCA) showed wide intervals of net clinical benefit for the models. Machine learning can therefore be used to predict dialysis adequacy, with the RF model showing the best overall performance.
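The evaluation workflow described above (70/30 split, fivefold internal cross-validation, and test-set metrics) can be sketched with scikit-learn. This is a minimal illustration for the RF model only, using synthetic stand-in data; the feature set, hyperparameters, and random seeds are assumptions, not details from the study.

```python
# Hedged sketch of the abstract's evaluation pipeline (RF branch only).
# The data are synthetic placeholders; no values here reproduce the study's results.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, f1_score, recall_score, roc_auc_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the 264 patient case records (12 assumed predictors).
X, y = make_classification(n_samples=264, n_features=12, random_state=0)

# 70% training / 30% test split, as described in the study.
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

rf = RandomForestClassifier(n_estimators=200, random_state=0)

# Fivefold internal cross-validation on the training set.
cv_auc = cross_val_score(rf, X_tr, y_tr, cv=5, scoring="roc_auc")

# Fit on the full training set, then compute the test-set metrics
# reported in the abstract (sensitivity, specificity, accuracy, F1, AUROC).
rf.fit(X_tr, y_tr)
pred = rf.predict(X_te)
prob = rf.predict_proba(X_te)[:, 1]

metrics = {
    "sensitivity": recall_score(y_te, pred),              # recall of positives
    "specificity": recall_score(y_te, pred, pos_label=0), # recall of negatives
    "accuracy": accuracy_score(y_te, pred),
    "f1": f1_score(y_te, pred),
    "auroc": roc_auc_score(y_te, prob),
}
print(metrics)
```

The same split and metric functions would be reused for the XGBoost and LASSO (L1-penalized logistic regression) models, with confidence intervals typically obtained by bootstrapping the test set.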