SPOT-1D-Single: improving the single-sequence-based prediction of protein secondary structure, backbone angles, solvent accessibility and half-sphere exposures using a large training set and ensembled deep learning

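Computer science, Set (abstract data type), Convolutional neural network, Hotspot (computer programming), Sequence (biology), Artificial intelligence, Algorithm, Protein secondary structure, Deep learning, Test set, Pattern recognition (psychology), Protein structure prediction, Protein structure, Biological system, Biology, Genetics, Biochemistry, Programming language, Operating system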
Authors
Jaspreet Singh, Thomas Litfin, Kuldip K. Paliwal, Jaswinder Singh, Anil Kumar Hanumanthappa, Yaoqi Zhou
Source
Journal: Bioinformatics [Oxford University Press]
Volume (Issue): 37 (20): 3464-3472; Citations: 47
Identifier
DOI: 10.1093/bioinformatics/btab316
Abstract

Motivation: Knowing protein secondary structure and other one-dimensional structural properties is essential for accurate protein structure and function prediction. As a result, many methods have been developed for predicting these one-dimensional structural properties. However, most methods rely on evolutionary information that may not exist for many proteins due to a lack of sequence homologs. Moreover, obtaining evolutionary information is computationally intensive as the library of protein sequences continues to expand exponentially. Here, we developed a new single-sequence method called SPOT-1D-Single, based on a large training dataset of 39 120 proteins deposited prior to 2016 and an ensemble of hybrid long-short-term-memory bidirectional neural networks and convolutional neural networks.

Results: We showed that SPOT-1D-Single consistently improves over SPIDER3-Single and ProteinUnet for secondary structure, solvent accessibility, contact number and backbone angle prediction on all seven independent test sets (TEST2018, SPOT-2016, SPOT-2016-HQ, SPOT-2018, SPOT-2018-HQ, CASP12 and CASP13 free-modeling targets). For example, the accuracy of predicted three-state secondary structure ranges from 72.12% to 74.28% for SPOT-1D-Single, compared to 69.1–72.6% for SPIDER3-Single and 70.6–73% for ProteinUnet. SPOT-1D-Single also predicts three-state (SS3) and eight-state (SS8) secondary structure with 6.24% and 6.98% better accuracy, respectively, than SPOT-1D on SPOT-2018 proteins with no homologs (Neff = 1). The new method's improvement over existing techniques is due to a larger training set combined with ensembled learning.

Availability and implementation: A standalone version of SPOT-1D-Single is available at https://github.com/jas-preet/SPOT-1D-Single. Direct predictions can also be made at https://sparks-lab.org/server/spot-1d-single. The datasets used in this research can also be downloaded from GitHub.

Supplementary information: Supplementary data are available at Bioinformatics online.
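
To make the two ideas in the abstract concrete (combining several networks into an ensemble, and scoring three-state secondary structure accuracy), here is a minimal sketch in Python. It is not the SPOT-1D-Single code: the abstract does not say how the ensemble members are combined, so a plain mean of per-residue class probabilities is assumed here, and the function names (`average_ensemble`, `decode`, `q3_accuracy`) and the toy probabilities are hypothetical.

```python
from typing import List, Sequence

# Three-state secondary structure labels: helix, strand, coil.
SS3_CLASSES = ("H", "E", "C")

# One model's output: for each residue, a probability per SS3 class.
ModelOutput = Sequence[Sequence[float]]


def average_ensemble(model_probs: Sequence[ModelOutput]) -> List[List[float]]:
    """Average per-residue class probabilities over several models.

    Assumption: the ensemble is a simple unweighted mean of model outputs.
    """
    if not model_probs:
        raise ValueError("need at least one model output")
    n_models = len(model_probs)
    n_residues = len(model_probs[0])
    if any(len(m) != n_residues for m in model_probs):
        raise ValueError("all models must cover the same residues")
    return [
        [sum(m[i][k] for m in model_probs) / n_models
         for k in range(len(SS3_CLASSES))]
        for i in range(n_residues)
    ]


def decode(probs: ModelOutput) -> str:
    """Pick the most probable class for every residue."""
    return "".join(
        SS3_CLASSES[max(range(len(p)), key=p.__getitem__)] for p in probs
    )


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example with two hypothetical models and four residues.
model_a = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.1, 0.3, 0.6], [0.2, 0.6, 0.2]]
model_b = [[0.6, 0.1, 0.3], [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], [0.1, 0.3, 0.6]]

predicted = decode(average_ensemble([model_a, model_b]))
print(predicted)                       # HHCE
print(q3_accuracy(predicted, "HHCC"))  # 75.0
```

In the toy data the two models disagree on the second and fourth residues, and averaging their probabilities settles each case in favour of the more confident model.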
