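Computer science
Benchmark (surveying)
Graph
Embedding
Artificial intelligence
Convolutional neural network
Representation (politics)
Deep learning
Encoding (memory)
Machine learning
Pattern recognition (psychology)
Authors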
Hailong Jin,Wei Du,Jiawei Gu,Tianhao Zhang,Xiaohu Shi
Identifier
DOI:10.1109/bibm52615.2021.9669366
Abstract
Protein secondary structure prediction remains a challenging task in bioinformatics, especially for 8-state (Q8) classification. To address this problem, we propose a deep learning model that integrates a graph convolutional network (GCN) and a bidirectional long short-term memory (Bi-LSTM) network. In the model, the GCN synthesizes information about amino acids and their interactions, while the Bi-LSTM captures long-range dependencies between amino acids. For sequence representation, a new protein embedding derived from ProtTrans is used instead of the traditional amino acid one-hot encoding, together with evolutionary features from PSSM and HHM profiles. Amino acid contact potential derived from SPOTContact-Helical is used to construct the amino acid graph. To verify the effectiveness of the proposed model, we applied it to several benchmark datasets, where it obtained Q8 accuracies of 78.05%, 76.81%, 72.84%, 74.46%, and 76.04% on the CASP10, CASP11, CASP12, CB513, and TS115 datasets, respectively. Compared with 8 state-of-the-art methods, our model achieved the best performance on most of the datasets.
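The abstract does not say which GCN variant the model uses or how contact values become graph edges. As a rough illustration only, the following sketch applies one graph-convolution step with the commonly used symmetric normalization, ReLU(D^{-1/2}(A + I)D^{-1/2} X W), to a toy residue graph. The 0.5 contact threshold, the function names, and the toy matrices are all assumptions made for this example and are not taken from the paper.

```python
import math


def normalized_adjacency(contact, threshold=0.5):
    """Return D^{-1/2} (A + I) D^{-1/2} for a symmetric contact matrix.

    `threshold` is an assumed cutoff for turning contact scores into edges.
    """
    n = len(contact)
    # Binary edges where the contact score meets the threshold, plus self-loops.
    adj = [[1.0 if i == j or contact[i][j] >= threshold else 0.0
            for j in range(n)] for i in range(n)]
    # Self-loops guarantee every degree is at least 1, so no division by zero.
    deg = [sum(row) for row in adj]
    return [[adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y)))
             for j in range(len(y[0]))] for i in range(len(x))]


def gcn_layer(a_hat, features, weights):
    """One propagation step: ReLU(A_hat @ X @ W)."""
    h = matmul(matmul(a_hat, features), weights)
    return [[max(0.0, v) for v in row] for row in h]


# Toy data (hypothetical): 3 residues, 2 features per residue, 1 output unit.
contact = [[1.0, 0.9, 0.1],
           [0.9, 1.0, 0.2],
           [0.1, 0.2, 1.0]]
features = [[1.0, 0.0],
            [0.0, 1.0],
            [1.0, 1.0]]
weights = [[1.0],
           [1.0]]

a_hat = normalized_adjacency(contact)
hidden = gcn_layer(a_hat, features, weights)
```

In this toy graph, residues 0 and 1 are connected and residue 2 is isolated, so the first two rows of `hidden` average their neighbors' features while the third keeps only its own. In a model like the one described, per-residue outputs of this kind would be passed on to a sequence model such as a Bi-LSTM; how the two parts are actually wired together is not specified in the abstract.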