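Benchmark (surveying)
Computer science
Protein sequencing
Deep learning
Artificial intelligence
Encoding
Computational biology
Protein function
Annotation
Artificial neural network
Sequence (biology)
Function (biology)
Machine learning
Protein function prediction
Biology
Pattern recognition (psychology)
Peptide sequence
Gene
Genetics
Geography
Geodesy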
Authors
Fuhao Zhang, Hong Song, Min Zeng, Yaohang Li, Lukasz Kurgan, Min Li
Source
Journal: Proteomics
[Wiley]
Date: 2019-05-27
Volume (issue): 19 (12)
Citations: 74
Identifier
DOI:10.1002/pmic.201900019
Abstract
Annotation of protein functions plays an important role in understanding life at the molecular level. High‐throughput sequencing produces massive numbers of raw protein sequences, and only about 1% of them have been manually annotated with functions. Experimental annotation of functions is expensive and time‐consuming, and it does not keep up with the rapid growth in the number of sequences. This motivates the development of computational approaches that predict protein functions. A novel deep learning framework, DeepFunc, is proposed, which accurately predicts protein functions from protein sequence‐ and network‐derived information. More precisely, DeepFunc uses a long and sparse binary vector to encode information concerning domains, families, and motifs that is collected from the InterPro tool and associated with the input protein sequence. This vector is processed with two neural layers to obtain a low‐dimensional vector, which is combined with topological information extracted from protein–protein interactions (PPIs) and functional linkages. The combined information is processed by a deep neural network that predicts protein functions. DeepFunc is empirically and comparatively tested on a benchmark testing dataset and the Critical Assessment of protein Function Annotation algorithms (CAFA) 3 dataset. The experimental results demonstrate that DeepFunc outperforms current methods on the testing dataset and that it secures the highest Fmax = 0.54 and AUC = 0.94 on the CAFA3 dataset.
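The data flow described in the abstract can be illustrated with a minimal NumPy sketch: a sparse binary signature vector is reduced by two layers, concatenated with a network-derived vector, and passed to a further network that scores each function. This is not the authors' implementation. The layer sizes, ReLU and sigmoid activations, random untrained weights, the toy signature vocabulary, and all function names below are assumptions chosen only for illustration; the abstract does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy signature vocabulary (hypothetical; a real one would hold many
# thousands of InterPro entries, which is why the vector is long and sparse).
vocab = {"IPR000001": 0, "IPR000002": 1, "IPR000003": 2,
         "IPR000004": 3, "IPR000005": 4}

N_NET = 8         # assumed size of the network-derived (PPI) feature vector
N_LOW = 4         # assumed size of the low-dimensional signature vector
N_FUNCTIONS = 3   # assumed number of function labels to score


def encode_signatures(hits, vocab):
    """Binary vector with a 1 for each known signature reported for a protein."""
    vec = np.zeros(len(vocab), dtype=np.float32)
    for accession in hits:
        idx = vocab.get(accession)
        if idx is not None:       # signatures outside the vocabulary are ignored
            vec[idx] = 1.0
    return vec


def init_layer(n_in, n_out):
    """Small random weights and zero biases (untrained, for illustration)."""
    return rng.normal(0.0, 0.1, size=(n_in, n_out)), np.zeros(n_out)


def relu(x):
    return np.maximum(x, 0.0)


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def forward(sig_vec, net_vec, params):
    """Two reducing layers, concatenation, then a scoring network."""
    (w1, b1), (w2, b2), (w3, b3), (w4, b4) = params
    hidden = relu(sig_vec @ w1 + b1)
    low_dim = relu(hidden @ w2 + b2)             # low-dimensional signature vector
    combined = np.concatenate([low_dim, net_vec])
    hidden2 = relu(combined @ w3 + b3)
    return sigmoid(hidden2 @ w4 + b4)            # one score per function label


params = [
    init_layer(len(vocab), 6),
    init_layer(6, N_LOW),
    init_layer(N_LOW + N_NET, 6),
    init_layer(6, N_FUNCTIONS),
]

sig_vec = encode_signatures(["IPR000002", "IPR000005", "IPR999999"], vocab)
net_vec = rng.normal(size=N_NET)                 # stand-in for PPI topology features
scores = forward(sig_vec, net_vec, params)
```

Each output score is treated here as an independent probability because a protein can carry several functions at once; that multi-label reading is also an assumption of the sketch.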