Boosted neural networks scoring functions for accurate ligand docking and ranking

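Docking (animal) Virtual screening Artificial intelligence Computer science Machine learning Binding affinity Artificial neural network Decoy Drug discovery Pattern recognition (psychology) Bioinformatics Chemistry Biology Biochemistry Medicine Nursing Receptor
Authors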
Hossam M. Ashtawy,Nihar R. Mahapatra
Source
Journal: Journal of Bioinformatics and Computational Biology [Imperial College Press]
Volume and issue: 16 (02): 1850004-1850004 Citations: 14
Identifier
DOI:10.1142/s021972001850004x
Abstract

Predicting the native poses of ligands correctly is one of the most important steps towards successful structure-based drug design. Binding affinities (BAs) estimated by traditional scoring functions (SFs) are typically used to score and rank-order poses to select the most promising conformation. This BA-based approach is widely applied and some success has been reported, but it is inconsistent and still far from perfect. The main reason for this is that SFs are trained on experimental BA values of only native poses found in co-crystallized structures of protein-ligand complexes (PLCs). However, during docking, they are needed to discriminate between native and decoy poses, a task for which they have not been specifically designed. To overcome this limitation, we propose to build task-specific SFs that model binding affinities (scoring task) as well as conformations (docking task) using the root mean square deviation (RMSD) of a ligand pose from the native pose. Our models are based on boosted ensembles of neural networks and other state-of-the-art machine learning (ML) algorithms in conjunction with multi-perspective interaction modeling techniques for accurate characterization of PLCs. We assess the docking and scoring/ranking accuracies of the proposed ML SFs as well as three conventional SFs in the context of the 2014 CSAR benchmark exercise that encompasses three high-quality protein systems and a diverse set of drug-like molecules. Our proposed docking-specific SFs provide a substantial improvement in the docking task. We find that RMSD-based SFs for BsN, an ensemble neural networks (NN) model based on boosting, and six other ML models provide more than 120% improvement, on average, over their BA-based counterparts. 
In terms of scoring/ranking accuracy, we find that the approach of using RMSD-based BsN to select the top ligand pose followed by applying BA-based BsN to rank ligands using predicted BA scores leads to consistent and correctly ranked ligands for the two protein targets Spleen Tyrosine Kinase (SYK) and tRNA (m1G37) methyltransferase (TrmD). In addition, the ensemble NN SF BsN is at least 250% more accurate than a single neural network (SNN) model. We further find that ensemble models based on NNs surpass SFs based on other state-of-the-art ML algorithms such as BRT, RF, SVM, and kNN. Finally, our RF model fitted to PLCs characterized by multiple sets of descriptors from four different sources (X-Score, AffiScore, RF-Score, and GOLD) substantially outperforms the SF RF-Score that uses only one set of features, underlining the value of multi-perspective modeling.
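
The two-stage procedure described in the abstract (pick the top pose with an RMSD-based SF, then rank ligands with a BA-based SF) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the two-element feature vectors, and the toy predictors `toy_rmsd` and `toy_ba` are all hypothetical stand-ins for trained models such as the RMSD-based and BA-based BsN.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Features = Sequence[float]  # hypothetical descriptor vector for one pose


def select_top_pose(poses: List[Features],
                    predict_rmsd: Callable[[Features], float]) -> int:
    """Return the index of the pose with the lowest predicted RMSD."""
    return min(range(len(poses)), key=lambda i: predict_rmsd(poses[i]))


def rank_ligands(ligand_poses: Dict[str, List[Features]],
                 predict_rmsd: Callable[[Features], float],
                 predict_ba: Callable[[Features], float]) -> List[Tuple[str, float]]:
    """Score each ligand's top pose with the BA model; sort by predicted BA, highest first."""
    scored = []
    for name, poses in ligand_poses.items():
        best = poses[select_top_pose(poses, predict_rmsd)]
        scored.append((name, predict_ba(best)))
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy stand-ins (assumptions): feature 0 plays the role of predicted RMSD,
# feature 1 plays the role of predicted binding affinity.
def toy_rmsd(features: Features) -> float:
    return features[0]


def toy_ba(features: Features) -> float:
    return features[1]


ligand_poses = {
    "lig_a": [[2.0, 0.1], [0.5, 9.0]],
    "lig_b": [[1.0, 4.0], [3.0, 9.5]],
}

ranking = rank_ligands(ligand_poses, toy_rmsd, toy_ba)
print(ranking)
```

In this toy data, the second pose of `lig_b` has the highest affinity-like value but a large RMSD-like value, so it is never scored by the BA model. Selecting poses by predicted BA alone would have ranked `lig_b` first, which illustrates why the pose-selection and ranking tasks are handled by separate models in the approach the abstract describes.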

