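Mycobacterium tuberculosis
Machine learning
Tuberculosis
Artificial intelligence
Random forest
Test set
Gradient boosting
Computer science
Artificial neural network
Support vector machine
Set (abstract data type)
Identification (biology)
Infectious disease (medical specialty)
Boosting (machine learning)
Biology
Drug discovery
Virtual screening
Disease
Bioinformatics
Medicine
Pathology
Plant
Programming language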
Authors
Qing Ye,Xin Chai,Dejun Jiang,Liu Yang,Chao Shen,Xujun Zhang,Dan Li,Dong-Sheng Cao,Tingjun Hou
Abstract
Tuberculosis (TB) is an infectious disease caused by Mycobacterium tuberculosis (Mtb) and is one of the top 10 causes of death globally. Extensively drug-resistant tuberculosis (XDR-TB), which is resistant to the commonly used first-line drugs, has emerged as a major challenge to TB treatment, so novel drug candidates are needed. In this study, four machine learning (ML) algorithms, namely support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost) and deep neural networks (DNN), were applied to different types of molecular representations to develop classification models that distinguish Mtb inhibitors from noninhibitors. The results demonstrate that the XGBoost model exhibits the best prediction performance. Two consensus strategies were then employed to integrate the predictions from multiple models. The consensus model that stacks the RF, XGBoost and DNN predictions offers the best predictions, with an area under the receiver operating characteristic curve (AUC) of 0.842 for the 10-fold cross-validated training set and 0.942 for the external test set. In addition, the association between the important descriptors and the bioactivities of molecules was interpreted using the Shapley additive explanations (SHAP) method. Finally, an online webserver called ChemTB (http://cadd.zju.edu.cn/chemtb/) was developed as a freely available computational tool for detecting potential Mtb inhibitors.
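To make the evaluation metric and the idea of a consensus model concrete, the following is a minimal sketch and not the authors' implementation. It computes the area under the ROC curve from its rank interpretation and combines per-model scores by simple averaging, which is used here only as a stand-in for the stacking meta-learner described in the abstract. The function names (`roc_auc`, `average_consensus`) and all scores are hypothetical toy values, not data from the study.

```python
from statistics import mean


def roc_auc(labels, scores):
    """AUC as the probability that a random positive is scored above
    a random negative; ties between a positive and a negative count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def average_consensus(*model_scores):
    """Average the per-molecule scores of several models.
    Assumption: plain averaging replaces a trained stacking meta-learner."""
    return [mean(column) for column in zip(*model_scores)]


# Hypothetical toy data: 1 = inhibitor, 0 = noninhibitor.
labels = [1, 1, 1, 0, 0, 0]
rf_scores = [0.9, 0.4, 0.7, 0.5, 0.2, 0.1]
xgb_scores = [0.8, 0.6, 0.3, 0.4, 0.3, 0.2]
dnn_scores = [0.7, 0.5, 0.6, 0.5, 0.1, 0.3]

consensus = average_consensus(rf_scores, xgb_scores, dnn_scores)

for name, scores in [
    ("RF", rf_scores),
    ("XGBoost", xgb_scores),
    ("DNN", dnn_scores),
    ("consensus", consensus),
]:
    print(f"{name}: AUC = {roc_auc(labels, scores):.3f}")
```

In a real stacking setup, the base-model predictions would typically be produced out-of-fold and fed to a second-level learner rather than averaged; the averaging above only illustrates how combining models can change the ranking of molecules.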