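Computer science
Support vector machine
Graph
Random forest
Machine learning
Artificial intelligence
Robustness (evolution)
Convolutional neural network
Feature (linguistics)
Feature learning
Theoretical computer science
Chemistry
Biochemistry
Gene
Linguistics
Philosophy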
Authors
Hongliang Zhou, Rik Sarkar
Identifier
DOI:10.1101/2023.11.13.566879
Abstract
Moonlighting proteins (MPs) play critical roles in cellular functions and disease, yet their complex nature challenges traditional identification methods. This research examines whether Graph Machine Learning (GML) can effectively use Protein-Protein Interaction (PPI) networks and physicochemical properties for MP prediction. Our study focuses on two GML architectures, Graph Attention Networks (GAT) and Graph Convolutional Networks (GCN), which are augmented with quasi-sequence-order (QSOrder) and amino acid composition (AAC) descriptors to enrich the feature representation. We evaluated these models on a dataset of 310 proteins and found that GML substantially outperforms conventional methods such as Support Vector Machines (SVM), K-nearest neighbors (KNN), and Random Forest (RF), especially in scenarios with limited data, where Deep Neural Networks (DNNs) typically struggle. Notably, the GAT model, when enhanced with QSOrder and PPI data, exhibited exceptional accuracy and robustness across the precision, F1-score, and ROC-AUC metrics. This indicates a significant edge in capturing the complex predictive signals embedded in PPI networks. Our findings strongly support the integration of GML in MP research and suggest a novel approach for exploring MPs through their interaction partners.
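As an illustration of one of the node features named in the abstract, amino acid composition (AAC) is commonly defined as the fraction of each of the 20 standard amino acids in a protein sequence. The sketch below is a minimal version under that assumption; it is not the authors' implementation. The function name `amino_acid_composition`, the alphabetical residue ordering, and the choice to skip non-standard residue codes are all assumptions made here for illustration.

```python
from collections import Counter

# The 20 standard amino acids in one-letter code (ordering is an assumption).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence: str) -> list[float]:
    """Return a 20-dimensional vector of residue frequencies.

    Non-standard codes (e.g. 'X', 'B') are skipped; this is an
    assumption of the sketch, not something stated in the abstract.
    """
    seq = sequence.upper()
    counts = Counter(ch for ch in seq if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    # One frequency per amino acid, in the fixed AMINO_ACIDS order.
    return [counts[aa] / total for aa in AMINO_ACIDS]


if __name__ == "__main__":
    # Hypothetical toy sequence, for illustration only.
    features = amino_acid_composition("ACCA")
    print(len(features), features[:2])
```

A vector like this could serve as the per-protein feature attached to each node of a PPI graph before it is passed to a GCN or GAT layer; QSOrder descriptors would be concatenated in the same way.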