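Computer science
Initialization
Graph
Artificial intelligence
Convolutional neural network
Machine learning
Artificial neural network
Node (physics)
Encoder
Deep learning
Data mining
Theoretical computer science
Structural engineering
Operating system
Engineering
Programming language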
Authors
Yahui Long,Min Wu,Yong Liu,Yuan Fang,Chee Keong Kwoh,Jinmiao Chen,Jiawei Luo,Xiaoli Li
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2022-02-14
Volume (issue): 38 (8): 2254-2262
Citations: 54
Identifier
DOI:10.1093/bioinformatics/btac100
Abstract
Graphs or networks are widely used to model the interactions between different entities (e.g. proteins, drugs, etc.) in biomedical applications. Predicting potential interactions/links in biomedical networks is important for understanding the pathological mechanisms of various complex human diseases, as well as for screening compound targets in drug discovery. Graph neural networks (GNNs) have been used for link prediction in various biomedical networks; they rely on node features extracted from different data sources, e.g. sequence, structure and network data. However, it is challenging to integrate these data sources effectively and to extract features automatically for different link prediction tasks.

In this article, we propose a novel Pre-Training Graph Neural Networks-based framework named PT-GNN to integrate different data sources for link prediction in biomedical networks. First, we design expressive deep learning methods [e.g. convolutional neural network and graph convolutional network (GCN)] to learn features for individual nodes from sequence and structure data. Second, we propose a GCN-based encoder to refine the node features by modelling the dependencies among nodes in the network. Third, the node features are pre-trained based on graph reconstruction tasks. The pre-trained features can be used for model initialization in downstream tasks. Extensive experiments have been conducted on two critical link prediction tasks, i.e. synthetic lethality (SL) prediction and drug-target interaction (DTI) prediction. Experimental results demonstrate that PT-GNN outperforms the state-of-the-art methods for SL prediction and DTI prediction. In addition, the pre-trained features improve the performance and reduce the training time of existing models.

Python code and the dataset are available at: https://github.com/longyahui/PT-GNN.

Supplementary data are available at Bioinformatics online.
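The abstract's second and third steps (a GCN-based encoder followed by pre-training on graph reconstruction) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the inner-product decoder, the binary cross-entropy loss, the two-layer depth, the toy 4-node graph and all function names are assumptions chosen for illustration, and the weights are random rather than trained.

```python
import math
import random

def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a dense adjacency matrix."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(a_norm, features, weight, activate=True):
    """One GCN layer: A_norm @ X @ W, followed by ReLU when activate is True."""
    hidden = matmul(matmul(a_norm, features), weight)
    if activate:
        hidden = [[max(0.0, v) for v in row] for row in hidden]
    return hidden

def reconstruct(z):
    """Assumed inner-product decoder: sigmoid(Z @ Z^T)."""
    z_t = [list(col) for col in zip(*z)]
    logits = matmul(z, z_t)
    return [[1.0 / (1.0 + math.exp(-v)) for v in row] for row in logits]

def reconstruction_loss(adj, recon, eps=1e-9):
    """Mean binary cross-entropy between the true and reconstructed adjacency."""
    n = len(adj)
    total = 0.0
    for i in range(n):
        for j in range(n):
            p = min(max(recon[i][j], eps), 1.0 - eps)
            total -= adj[i][j] * math.log(p) + (1.0 - adj[i][j]) * math.log(1.0 - p)
    return total / (n * n)

def random_matrix(rows, cols, rng):
    """Small random weights; a real model would learn these by gradient descent."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

# Hypothetical toy network: a 4-node path graph 0-1-2-3.
ADJ = [
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
]

rng = random.Random(0)
X = random_matrix(4, 3, rng)    # stand-in for per-node sequence/structure features
W1 = random_matrix(3, 4, rng)   # first-layer weights
W2 = random_matrix(4, 2, rng)   # second-layer weights (embedding size 2)

A_NORM = normalize_adjacency(ADJ)
H = gcn_layer(A_NORM, X, W1, activate=True)
Z = gcn_layer(A_NORM, H, W2, activate=False)  # linear output layer for embeddings
RECON = reconstruct(Z)
LOSS = reconstruction_loss(ADJ, RECON)
print("reconstruction loss:", LOSS)
```

In an actual pre-training run, this loss would be minimized with respect to the weights, and the resulting embeddings `Z` would then initialize a downstream link prediction model, as the abstract describes.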