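Computer science
Computational biology
Benchmark (surveying)
Complement (music)
Identification (biology)
Machine learning
Artificial intelligence
Node (physics)
Biology
Gene
Engineering
Ecology
Genetics
Structural engineering
Phenotype
Geodesy
Complementarity
Geography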
Authors
Chenping Lei,Kewei Zhou,Jingyan Zheng,M. Zhao,Yan Huang,Huaqin He,Shiping Yang,Ziding Zhang
Identifier
DOI:10.1021/acs.jproteome.3c00364
Abstract
Plant–pathogen protein–protein interactions (PPIs) play crucial roles in the arms race between plants and pathogens. Identifying these interspecies PPIs is therefore important for a mechanistic understanding of pathogen infection and plant immunity. Computational prediction methods can complement experimental efforts, but their predictive performance still needs improvement. Motivated by the rapid development of natural language processing and its successful applications in protein bioinformatics, we present an improved XGBoost-based plant–pathogen PPI predictor, AraPathogen2.0, which takes as input sequence encodings from the pretrained protein language model ESM2 and node representations of the Arabidopsis PPI network produced by the graph embedding technique struc2vec. Stringent benchmark experiments showed that AraPathogen2.0 achieved better performance than its predecessor, especially on test data containing novel proteins unseen in the training data.
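The abstract names two ideas that a small example can make concrete: combining per-protein vectors into one input per candidate pair, and evaluating separately on test pairs whose proteins never appeared in training. The following is a minimal sketch, not the authors' implementation. It assumes plain concatenation of the vectors and assumes that only the plant protein has a network-node vector. The function names, protein IDs, and toy numbers are hypothetical, and the toy lists merely stand in for real ESM2 and struc2vec outputs.

```python
from typing import Dict, List, Sequence, Tuple

Pair = Tuple[str, str]  # (plant_protein_id, pathogen_protein_id)


def build_pair_features(
    pair: Pair,
    seq_emb: Dict[str, Sequence[float]],
    node_emb: Dict[str, Sequence[float]],
) -> List[float]:
    """Build one feature vector for a plant-pathogen protein pair.

    Assumption: the vectors are simply concatenated in the order
    plant sequence vector, plant network-node vector, pathogen sequence vector.
    """
    plant, pathogen = pair
    return list(seq_emb[plant]) + list(node_emb[plant]) + list(seq_emb[pathogen])


def split_test_by_novelty(
    train_pairs: Sequence[Pair], test_pairs: Sequence[Pair]
) -> Dict[str, List[Pair]]:
    """Group test pairs by how many of their proteins are absent from training."""
    seen = {protein for pair in train_pairs for protein in pair}
    labels = ("both_seen", "one_unseen", "both_unseen")
    groups: Dict[str, List[Pair]] = {label: [] for label in labels}
    for pair in test_pairs:
        n_unseen = sum(protein not in seen for protein in pair)
        groups[labels[n_unseen]].append(pair)
    return groups


# Toy stand-ins for real embeddings (hypothetical IDs and values).
seq_emb = {"plantA": [0.1, 0.2], "plantB": [0.7, 0.8], "pathX": [0.3, 0.4]}
node_emb = {"plantA": [0.5], "plantB": [0.9]}

features = build_pair_features(("plantA", "pathX"), seq_emb, node_emb)

train = [("plantA", "pathX")]
test = [("plantA", "pathX"), ("plantB", "pathX"), ("plantC", "pathY")]
groups = split_test_by_novelty(train, test)
print(features)
print(groups)
```

In a fuller pipeline, vectors like `features` would be stacked into a matrix and passed to a gradient-boosted tree classifier, and performance would be reported per group so that results on unseen proteins are not masked by easier pairs.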