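Classifier (UML)
Artificial intelligence
Support vector machine
Ensemble learning
Computer science
Machine learning
Word2vec
Pattern recognition (psychology)
Chemistry
Embedding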
Authors
Hongqi Zhang,Shanghua Liu,Rui Li,Jun-Wen Yu,Dong-Xin Ye,Shi-Shi Yuan,Hao Lin,Huang Cheng-bing,Hua Tang
Source
Journal: ACS Omega
[American Chemical Society]
Date: 2024-02-08
Volume (issue), pages: 9 (7), 8439-8447
Citations: 18
Identifier
DOI:10.1021/acsomega.3c09587
Abstract
In biological organisms, metal ion-binding proteins participate in numerous metabolic activities and are closely associated with various diseases. To accurately predict whether a protein binds to metal ions and the type of metal ion-binding protein, this study proposed a classifier named MIBPred. The classifier incorporated advanced Word2Vec technology from the field of natural language processing to extract semantic features of the protein sequence language and combined them with position-specific scoring matrix (PSSM) features. Furthermore, an ensemble learning model was employed for the metal ion-binding protein classification task. In the model, we independently trained XGBoost, LightGBM, and CatBoost algorithms and integrated the output results through an SVM voting mechanism. This innovative combination has led to a significant breakthrough in the predictive performance of our model. As a result, we achieved accuracies of 95.13% and 85.19%, respectively, in predicting metal ion-binding proteins and their types. Our research not only confirms the effectiveness of Word2Vec technology in extracting semantic information from protein sequences but also highlights the outstanding performance of the MIBPred classifier in the problem of metal ion-binding protein types. This study provides a reliable tool and method for the in-depth exploration of the structure and function of metal ion-binding proteins.
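
The two data-shaping steps implied by the abstract can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how sequences are split into "words" for Word2Vec, so overlapping k-mers with k = 3 are assumed here, and it does not say what the SVM receives, so concatenated class probabilities from the three base models are assumed. The function names `kmer_tokens` and `stack_meta_features` are hypothetical, and the probability values in the usage example are made up for illustration.

```python
from typing import List, Sequence


def kmer_tokens(sequence: str, k: int = 3) -> List[str]:
    """Split a protein sequence into overlapping k-mers ("words").

    Assumption: k-mer tokenization with k=3; the abstract does not
    specify the tokenization used before Word2Vec training.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.strip().upper()
    # A sequence shorter than k yields an empty token list.
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def stack_meta_features(
    per_model_probs: Sequence[Sequence[Sequence[float]]],
) -> List[List[float]]:
    """Concatenate each base model's class probabilities per sample.

    per_model_probs[m][i] is model m's probability vector for sample i.
    The result has one row per sample and could serve as input to a
    second-stage classifier such as an SVM (an assumed design).
    """
    n_samples = len(per_model_probs[0])
    if any(len(model) != n_samples for model in per_model_probs):
        raise ValueError("all models must score the same samples")
    return [
        [p for model in per_model_probs for p in model[i]]
        for i in range(n_samples)
    ]


# Usage: tokenize a toy sequence, then stack toy outputs of three models.
tokens = kmer_tokens("MKVLAH")
meta = stack_meta_features([
    [[0.9, 0.1], [0.2, 0.8]],  # stand-in for the first base model
    [[0.7, 0.3], [0.4, 0.6]],  # stand-in for the second base model
    [[0.8, 0.2], [0.1, 0.9]],  # stand-in for the third base model
])
```

In a full pipeline the token lists would be passed to a Word2Vec trainer and the stacked rows to an SVM; both steps are omitted here because the abstract gives no parameters for them.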