Enhancing Protein Function Prediction Performance by Utilizing AlphaFold-Predicted Protein Structures

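Keywords: Benchmark (surveying), Protein structure prediction, Computer science, Set (abstract data type), Training set, Function (biology), Protein function prediction, Performance prediction, Data mining, Machine learning, Artificial intelligence, Protein structure, Protein function, Simulation, Biology, Gene, Geodesy, Physics, Evolutionary biology, Biochemistry, Chemistry, Programming language, Geography, Nuclear magnetic resonance
Authors
Wenjian Ma, Shugang Zhang, Zhen Li, Mingjian Jiang, Shuang Wang, Weigang Lu, Xiangpeng Bi, Huasen Jiang, Henggui Zhang, Zhiqiang Wei
Source
Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Volume (issue): 62 (17): 4008-4017 Citations: 38
Identifier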
DOI:10.1021/acs.jcim.2c00885
Abstract

The structure of a protein is of great importance in determining its functionality, and this characteristic can be leveraged to train data-driven prediction models. However, the limited number of available protein structures severely limits the performance of these models. AlphaFold2 and its open-source data set of predicted protein structures have provided a promising solution to this problem, and these predicted structures are expected to benefit the model performance by increasing the number of training samples. In this work, we constructed a new data set that acted as a benchmark and implemented a state-of-the-art structure-based approach for determining whether the performance of the function prediction model can be improved by putting additional AlphaFold-predicted structures into the training set and further compared the performance differences between two models separately trained with real structures only and AlphaFold-predicted structures only. Experimental results indicated that structure-based protein function prediction models could benefit from virtual training data consisting of AlphaFold-predicted structures. First, model performances were improved in all three categories of Gene Ontology terms (GO terms) after adding predicted structures as training samples. Second, the model trained only on AlphaFold-predicted virtual samples achieved comparable performances to the model based on experimentally solved real structures, suggesting that predicted structures were almost equally effective in predicting protein functionality.