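Computer science
Set (abstract data type)
Convolutional neural network
Hot spot (computer programming)
Sequence (biology)
Artificial intelligence
Algorithm
Protein secondary structure
Deep learning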
Test set
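Pattern recognition (psychology)
Protein structure prediction
Protein structure
Biological system
Biology
Genetics
Biochemistry
Programming language
Operating system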
Authors
Jaspreet Singh, Thomas Litfin, Kuldip K. Paliwal, Jaswinder Singh, Anil Kumar Hanumanthappa, Yaoqi Zhou
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2021-04-26
Volume (issue): pages: 37 (20): 3464-3472
Citations: 47
Identifier
DOI: 10.1093/bioinformatics/btab316
Abstract

Motivation
Knowing protein secondary structure and other one-dimensional structural properties is essential for accurate protein structure and function prediction. As a result, many methods have been developed for predicting these one-dimensional structural properties. However, most methods rely on evolutionary information that may not exist for many proteins due to a lack of sequence homologs. Moreover, obtaining evolutionary information is computationally intensive as the library of protein sequences continues to expand exponentially. Here, we developed a new single-sequence method called SPOT-1D-Single, based on a large training dataset of 39 120 proteins deposited prior to 2016 and an ensemble of hybrid long-short-term-memory bidirectional neural network and convolutional neural network.

Results
We showed that SPOT-1D-Single consistently improves over SPIDER3-Single and ProteinUnet for secondary structure, solvent accessibility, contact number and backbone angle prediction on all seven independent test sets (TEST2018, SPOT-2016, SPOT-2016-HQ, SPOT-2018, SPOT-2018-HQ, and CASP12 and CASP13 free-modeling targets). For example, the accuracy of predicted three-state secondary structure ranges from 72.12% to 74.28% for SPOT-1D-Single, compared to 69.1–72.6% for SPIDER3-Single and 70.6–73% for ProteinUnet. SPOT-1D-Single also predicts SS3 and SS8 with 6.24% and 6.98% better accuracy, respectively, than SPOT-1D on SPOT-2018 proteins with no homologs (Neff = 1). The new method's improvement over existing techniques is due to a larger training set combined with ensemble learning.

Availability and implementation
A standalone version of SPOT-1D-Single is available at https://github.com/jas-preet/SPOT-1D-Single. Direct predictions can also be made at https://sparks-lab.org/server/spot-1d-single. The datasets used in this research can also be downloaded from GitHub.

Supplementary information
Supplementary data are available at Bioinformatics online.
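
Illustrative note: the abstract reports three-state secondary structure (SS3) accuracy and attributes part of the gain to ensemble learning. The sketch below shows one common way such numbers can be produced, assuming that each model in an ensemble outputs per-residue class probabilities which are averaged before taking the most probable class. It is not the authors' code; the function names, the H/E/C label order and the toy probabilities are assumptions made for illustration, and SPOT-1D-Single may combine its models differently.

```python
from typing import List, Sequence

# Assumed label order for three-state secondary structure: helix, strand, coil.
SS3_LABELS = "HEC"


def ensemble_average(
    model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Average per-residue class probabilities over all models.

    model_probs[m][i][c] is the probability that model m assigns
    class c to residue i.
    """
    n_models = len(model_probs)
    n_residues = len(model_probs[0])
    n_classes = len(model_probs[0][0])
    return [
        [sum(m[i][c] for m in model_probs) / n_models for c in range(n_classes)]
        for i in range(n_residues)
    ]


def decode(probs: Sequence[Sequence[float]]) -> str:
    """Pick the most probable class for each residue."""
    return "".join(
        SS3_LABELS[max(range(len(p)), key=p.__getitem__)] for p in probs
    )


def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose predicted state matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


# Toy example: two hypothetical models, four residues.
model_a = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.2, 0.3, 0.5], [0.1, 0.2, 0.7]]
model_b = [[0.6, 0.3, 0.1], [0.6, 0.3, 0.1], [0.1, 0.6, 0.3], [0.2, 0.2, 0.6]]

predicted = decode(ensemble_average([model_a, model_b]))
print(predicted, q3_accuracy(predicted, "HHCC"))
```

In this toy case the two models disagree on the second and third residues, and averaging settles each by the combined probability mass rather than by either model alone.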