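Divergence (linguistics)
Set (abstract data type)
Confidence interval
Sequence (biology)
Computational biology
Virtual screening
Training set
Human proteins
Drug discovery
Computer science
Bioinformatics
Biology
Artificial intelligence
Mathematics
Statistics
Genetics
Gene
Philosophy
Linguistics
Programming language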
Authors
Jessica Binder, Joel Berendzen, Amy O. Stevens, Yi He, Jian Wang, Nikolay V. Dokholyan, Tudor I. Oprea
Identifier
DOI:10.1016/j.sbi.2022.102372
Abstract
We investigate the use of confidence scores to evaluate the accuracy of a given AlphaFold (AF2) protein model for drug discovery. Prediction of accuracy is improved by not considering confidence scores below 80 due to the effects of disorder. On a set of recent crystal structures, 95% are likely to have accurate folds. Conformational discordance in the training set has a much more significant effect on accuracy than sequence divergence. We propose criteria for models and residues that are possibly useful for virtual screening. Based on these criteria, AF2 provides models for half of understudied (dark) human proteins and two-thirds of residues in those models.
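The cutoff of 80 mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the confidence score is the per-residue pLDDT that AlphaFold model files store in the B-factor column of PDB `ATOM` records, and it reads one score per residue from the C-alpha atom. The function names `plddt_per_residue` and `confident_fraction` are hypothetical, and the sketch only shows thresholding at 80; it is not the paper's full set of criteria for models and residues.

```python
def plddt_per_residue(pdb_text):
    """Return {(chain, residue_number): pLDDT} read from C-alpha ATOM records.

    Assumption: the input is PDB-format text in which the B-factor
    column (columns 61-66) holds the per-residue pLDDT score.
    """
    scores = {}
    for line in pdb_text.splitlines():
        # Fixed-width PDB columns: atom name 13-16, chain 22,
        # residue number 23-26, B-factor 61-66 (1-based).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            chain = line[21]
            resseq = int(line[22:26])
            scores[(chain, resseq)] = float(line[60:66])
    return scores


def confident_fraction(scores, cutoff=80.0):
    """Fraction of residues whose score is at or above the cutoff."""
    if not scores:
        return 0.0
    kept = sum(1 for value in scores.values() if value >= cutoff)
    return kept / len(scores)
```

Calling `confident_fraction(plddt_per_residue(text))` on the contents of a model file gives the share of residues that would survive the cutoff. Whether a score of exactly 80 is kept or dropped is a choice made here for illustration.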