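Docking (animal)
Machine learning
AutoDock
Test set
Artificial intelligence
Computer science
Training set
Chemistry
Biochemistry
Medicine
Nursing
Bioinformatics
Gene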
Authors
Lukáš Bučinský,Marián Gall,Ján Matúška,Michal Pitoňák,Marek Štekláč
Abstract
Here we present three distinct machine learning (ML) approaches (TensorFlow, XGBoost, and SchNetPack) for docking score prediction. AutoDock Vina is used to evaluate the inhibitory potential of the ZINC15 in‐vivo and in‐vitro‐only sets towards the SARS‐CoV‐2 main protease. The in‐vivo set (59 884 compounds) is used for ML training (max. 80%), validation (5%), and testing (15%). The in‐vitro‐only set (174 014 compounds) is used to evaluate the predictive capability of the trained ML models. Contributions to the prediction error are analyzed with respect to the compounds' charge, number of atoms, and expected inhibitory potential (docking score). Methods for estimating the prediction error of new compounds are considered, yet critically rejected. Weighting the ML input with respect to the desired property (i.e., low docking score) appears to be a promising option for improving ML performance. The proposed models provide a significant reduction in the number of intriguing compounds that need to be investigated.
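As a rough illustration of the data handling described in the abstract, the following minimal sketch splits a compound set into 80% training, 5% validation, and 15% testing, and assigns larger sample weights to compounds with lower docking scores. The function names, the exponential weighting formula, and the `temperature` parameter are hypothetical choices made for this example; the abstract does not specify how the authors split or weighted their data.

```python
import math
import random


def split_indices(n, train_frac=0.80, val_frac=0.05, seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The test set receives whatever remains after the first two cuts.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = int(n * train_frac)
    n_val = int(n * val_frac)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def low_score_weights(scores, temperature=1.0):
    """Hypothetical weighting scheme (an assumption, not the paper's method).

    Each weight grows exponentially as the docking score falls below the
    mean score; weights are rescaled so that their average equals 1.
    """
    mean = sum(scores) / len(scores)
    raw = [math.exp(-(s - mean) / temperature) for s in scores]
    norm = sum(raw) / len(raw)
    return [w / norm for w in raw]


# Split the 59 884 compounds of the in-vivo set by index.
train, val, test = split_indices(59884)

# Toy docking scores: the lowest score receives the largest weight.
weights = low_score_weights([-9.0, -7.0, -5.0])
```

Weights of this kind could be passed to a learner that accepts per-sample weights, so that errors on strongly binding (low-score) compounds count more during training.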