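Amino acid
Pattern recognition (psychology)
Protein structure
Protein sequencing
Protein secondary structure
Protein structure prediction
Computational biology
Artificial neural network
Pseudo amino acid composition
Protein-protein interaction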
Authors
Mauricio Oberti,Iosif I. Vaisman
Source
Journal: Proteins
[Wiley]
Date: 2020-06-14
Volume (issue): 88 (11): 1472-1481
Citations: 3
Abstract
Intrinsically disordered regions (IDRs) play an important role in key biological processes and are closely related to human diseases. IDRs have great potential to serve as targets for drug discovery, most notably in disordered binding regions. Accurate prediction of IDRs is challenging because their genome-wide occurrence and low ratio of disordered residues make them difficult targets for traditional classification techniques. Existing computational methods mostly rely on sequence profiles to improve accuracy, which is time-consuming and computationally expensive. This article describes an ab initio, sequence-only prediction method based on reduced amino acid alphabets and convolutional neural networks (CNNs), which tries to overcome the challenge of accurate prediction posed by IDRs. We experiment with six different 3-letter reduced alphabets. We argue that the dimensional reduction in the input alphabet facilitates the detection of complex patterns within the sequence by the convolutional step. Experimental results show that our proposed IDR predictor performs at the same level as, or outperforms, other state-of-the-art methods in the same class, achieving an accuracy of 0.76 and an AUC of 0.85 on the publicly available Critical Assessment of protein Structure Prediction dataset (CASP10). Therefore, our method is suitable for proteome-wide disorder prediction, yielding similar or better accuracy than existing approaches at a faster speed.
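To make the input representation concrete, here is a minimal sketch of how a sequence might be mapped to a 3-letter reduced alphabet, one-hot encoded, and scanned by a single convolutional filter. The grouping used (hydrophobic, polar, charged) is an assumption chosen for illustration; the abstract does not list the six alphabets the authors tested. The function names and the hand-written filter are likewise hypothetical, and a real CNN would learn its filter weights rather than have them fixed.

```python
# Hypothetical 3-letter reduced alphabet (an assumption, not taken from the article):
# group 0 = hydrophobic, group 1 = polar/small, group 2 = charged.
GROUPS = ("AVLIMFWC", "GSTNQYP", "DEKRH")
RESIDUE_TO_GROUP = {aa: g for g, members in enumerate(GROUPS) for aa in members}


def reduce_sequence(seq):
    """Map each of the 20 standard residues to its reduced-alphabet group index."""
    try:
        return [RESIDUE_TO_GROUP[aa] for aa in seq.upper()]
    except KeyError as err:
        raise ValueError(f"unsupported residue: {err.args[0]!r}") from None


def one_hot(indices, n_groups=len(GROUPS)):
    """Turn group indices into one-hot rows (sequence length x n_groups)."""
    return [[1 if g == i else 0 for g in range(n_groups)] for i in indices]


def conv1d_valid(x, kernel):
    """Slide a (width x channels) filter along the sequence without padding."""
    width = len(kernel)
    return [
        sum(
            x[pos + j][c] * kernel[j][c]
            for j in range(width)
            for c in range(len(kernel[j]))
        )
        for pos in range(len(x) - width + 1)
    ]


# Illustrative filter that responds to "hydrophobic followed by charged".
encoded = one_hot(reduce_sequence("AKGAK"))
motif = [[1, 0, 0], [0, 0, 1]]
responses = conv1d_valid(encoded, motif)
```

With three input channels instead of twenty, each filter of a given width has far fewer weights, which is one way to read the abstract's argument that a reduced alphabet makes patterns easier for the convolutional step to detect.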