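Central nervous system
Boosting (machine learning)
Gradient boosting
Tree (set theory)
Neuroscience
Biology
Medicine
Computer science
Artificial intelligence
Mathematics
Random forest
Mathematical analysis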
Authors
Hong Wang, Pan Zhang, Stephen J. Barigye, James R. Empfield, Steven S. Wesolowski
Identifier
DOI:10.1080/17568919.2025.2557178
Abstract
This study aimed to develop a machine learning (ML) model for early-stage prediction of the human half-life of oral central nervous system (CNS) drugs and to establish a curated dataset, including key in vitro and in vivo data, to support future modeling efforts. Human and rat half-life, plasma protein binding (PPB), and liver microsomal clearance (LM) data for 76 diverse CNS drugs and candidates were obtained from public sources or evaluated at WuXi AppTec. Gradient tree boosting (GTB) models were constructed using ChemAxon's Trainer Engine. Feature importance was assessed, and model performance was evaluated on an external validation set. The best-performing model achieved 82.4% of predictions within two-fold of observed values, with a coefficient of determination (R²) of 0.75 and a root mean square error (RMSE) of 0.25. Good generalizability was confirmed using similarity-based data splitting and Y-randomization. Integration of in vitro features, preclinical in vivo data, and physicochemical properties substantially improved predictive performance. Key features driving accurate human half-life prediction were identified. The model has practical applications in early-stage prediction of human half-life and in prioritization of CNS drug candidates. The curated dataset offers a valuable resource to enhance internal databases and advance more robust predictive models.
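The evaluation metrics named in the abstract (fraction of predictions within two-fold of observed values, coefficient of determination, and RMSE) can be illustrated with a minimal sketch. The function names and the sample half-life values below are hypothetical and do not come from the paper. The abstract does not state the scale on which R² and RMSE were computed; the log10 scale used here is an assumption, chosen because it is a common convention for half-life data.

```python
import math


def fraction_within_fold(observed, predicted, fold=2.0):
    """Fraction of predictions whose ratio to the observed value
    lies in the closed interval [1/fold, fold]."""
    hits = sum(
        1 for o, p in zip(observed, predicted)
        if 1.0 / fold <= p / o <= fold
    )
    return hits / len(observed)


def rmse_log10(observed, predicted):
    """Root mean square error of log10-transformed values
    (the log scale is an assumption, not stated in the abstract)."""
    errors = [math.log10(p) - math.log10(o) for o, p in zip(observed, predicted)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def r_squared_log10(observed, predicted):
    """Coefficient of determination on log10-transformed values
    (again assuming a log scale)."""
    log_obs = [math.log10(o) for o in observed]
    log_pred = [math.log10(p) for p in predicted]
    mean_obs = sum(log_obs) / len(log_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(log_obs, log_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in log_obs)
    return 1.0 - ss_res / ss_tot


# Hypothetical half-lives in hours, for illustration only.
observed_h = [2.0, 4.0, 8.0, 10.0]
predicted_h = [3.0, 4.0, 20.0, 6.0]

print(fraction_within_fold(observed_h, predicted_h))
print(rmse_log10(observed_h, predicted_h))
print(r_squared_log10(observed_h, predicted_h))
```

In this toy example, three of the four predictions fall within two-fold of the observed value; the prediction of 20 h against an observed 8 h (ratio 2.5) does not.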