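Computer science
Disease
Bioinformatics
Hepatic insufficiency
Liver disease
Medicine
Artificial intelligence
Machine learning
Computational biology
Internal medicine
Biology
Authors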
Yi‐Hao Yu, Yuqi Yang, Qian Li, Jing Yuan, Yuguo Zha
Identifier
DOI:10.1038/s41598-025-96478-6
Abstract
Early and accurate identification of patients at high risk of metabolic dysfunction-associated steatotic liver disease (MASLD) is critical for prevention and may improve prognosis. We aimed to develop and validate an explainable machine learning (ML) prediction model for MASLD in the adult population. This national cross-sectional study used data from the National Health and Nutrition Examination Survey, 2017 to 2020, comprising 13,436 participants who were randomly split into training (70%), internal validation (20%), and external validation (10%) cohorts. MASLD was defined based on transient elastography and cardiometabolic risk factors. Using 50 readily obtainable medical characteristics, six ML algorithms were used to develop prediction models. Predictive performance was compared using several evaluation parameters, including the area under the receiver operating characteristic curve (AUC) and the area under the precision-recall (P-R) curve. Recursive feature elimination was applied to select the optimal feature subset, and the Shapley Additive exPlanations method provided global and local explanations for the model. The random forest (RF) model had the best discriminative ability among the six ML models, and the optimal 10-feature RF model was chosen as the final model. The final model accurately predicted MASLD in the internal and external validation cohorts (AUC: 0.928 and 0.918; area under the P-R curve: 0.876 and 0.863, respectively) and performed better than each of the traditional risk indicators for MASLD. In summary, an explainable 10-feature RF prediction model with excellent discrimination and calibration was developed and validated for MASLD using easily extracted clinical data.
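The cohort split and the two headline metrics described in the abstract can be illustrated with a minimal, standard-library-only sketch. It is not the authors' code. The function names (`split_indices`, `roc_auc`, `average_precision`), the random seed, and the rounding of cohort sizes are assumptions made for illustration, and average precision is used here as one common estimator of the area under the P-R curve; the abstract does not say which estimator the study used.

```python
import random


def split_indices(n, fractions=(0.7, 0.2, 0.1), seed=0):
    """Shuffle range(n) and cut it into three disjoint cohorts.

    The first two cohort sizes are rounded down; the last cohort takes
    the remainder. Seed and rounding rule are illustrative assumptions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * fractions[0])
    n_internal = int(n * fractions[1])
    train = idx[:n_train]
    internal = idx[n_train:n_train + n_internal]
    external = idx[n_train + n_internal:]
    return train, internal, external


def roc_auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values at the rank of each positive case.

    Assumes scores are not tied; with ties the result depends on sort order.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    hits = 0
    total = 0.0
    for rank, (_, y) in enumerate(ranked, start=1):
        if y == 1:
            hits += 1
            total += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return total / hits


if __name__ == "__main__":
    train, internal, external = split_indices(13436)
    print(len(train), len(internal), len(external))

    # Toy labels and predicted risks, not data from the study.
    y_true = [0, 0, 1, 1]
    y_score = [0.1, 0.4, 0.35, 0.8]
    print(roc_auc(y_true, y_score), average_precision(y_true, y_score))
```

In practice these metrics would be computed on the predicted probabilities of the fitted RF model for each validation cohort; the toy values above only show how the two quantities are defined.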