StackACPred: Prediction of anticancer peptides by integrating optimized multiple feature descriptors with stacked ensemble approach

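Computer science, Discriminant, Support vector machine, Artificial intelligence, Pseudo amino acid composition, Feature (linguistics), Ensemble forecasting, Machine learning, Pattern recognition (psychology), Data mining, Peptide, Chemistry, Linguistics, Biochemistry, Philosophy, Dipeptide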

Authors

Muhammad Arif, Saeed Ahmed, Fang Ge, Muhammad Kabir, Yaser Daanial Khan, Dong‐Jun Yu, Maha A. Thafar

Source

Journal: Chemometrics and Intelligent Laboratory Systems [Elsevier BV]
Date: 2021-11-17 Volume/Issue: 220: 104458-104458 Citations: 54

Identifier

DOI: 10.1016/j.chemolab.2021.104458

Abstract

Anticancer peptides (ACPs) have emerged as a potentially safe therapeutic agent for treating cancer. Identifying novel ACPs is crucial for gaining deep insight into their functional mechanisms and for vaccine production. Conventional wet-lab methods for finding ACPs are expensive, slow, and resource-intensive. Fast and accurate computational prediction of ACPs is therefore highly desirable, given the massive number of peptide sequences accumulated in the post-genomic era. Recently, several intelligent statistical approaches have been designed for discriminating ACPs from non-ACPs. Despite remarkable progress, the available methods still rely on inadequate feature descriptors and learning algorithms, which restricts their predictive performance. To address this, we develop a novel predictor called StackACPred for the correct identification of ACPs. More specifically, the proposed method uses three nominal feature encoding strategies based on evolutionary-profile and physicochemical information: segmented position-specific scoring matrix (SegPSSM), pseudo position-specific scoring matrix (PsePSSM), and extended pseudo amino acid composition (PseAAC). The extracted features are serially fused and further optimized through a support vector machine recursive feature elimination and correlation bias reduction (SVM-RFE + CBR) algorithm. The optimal selected attributes are then used to build a stacking-based ensemble model for identifying ACPs. The proposed StackACPred attained accuracies of 84.45% and 86.21% on the ACP740 and ACP240 datasets, respectively, under 5-fold cross-validation, which are 2.97% and 0.79% higher than those reported in existing studies. The empirical outcomes demonstrate the excellent discriminative power of the developed automated tool for annotating large-scale ACPs in particular and other peptides in general.

• We developed an intelligent predictor named StackACPred for the correct identification of ACPs.
• Three nominal feature encoding strategies based on evolutionary-profile and physicochemical information were used: N-segmented position-specific scoring matrix (N-SegPSSM), pseudo position-specific scoring matrix (PsePSSM), and extended pseudo amino acid composition (PseAAC).
• A support vector machine recursive feature elimination and correlation bias reduction (SVM-RFE + CBR) algorithm was used to select the optimal features.
• LightGBM and stacking-based ensemble classifiers were used for predicting ACPs with a k-fold cross-validation test.
• StackACPred produced better results than other state-of-the-art predictors.
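The workflow outlined in the abstract (serial feature fusion, SVM-based recursive feature elimination, a stacked ensemble, and 5-fold cross-validation) can be illustrated with a minimal scikit-learn sketch. This is not the authors' implementation. It rests on several assumptions: the data are synthetic, the three feature blocks merely stand in for SegPSSM, PsePSSM, and PseAAC descriptors, the correlation bias reduction (CBR) step is omitted, `GradientBoostingClassifier` is used in place of LightGBM, and the base learners, block sizes, and number of selected features are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (assumption): 200 "peptides", 60 numeric features.
X, y = make_classification(
    n_samples=200, n_features=60, n_informative=10, random_state=0
)

# Hypothetical descriptor blocks standing in for SegPSSM, PsePSSM, and PseAAC.
seg_pssm, pse_pssm, pse_aac = X[:, :20], X[:, 20:40], X[:, 40:]

# Serial fusion: concatenate the descriptor blocks column-wise.
fused = np.hstack([seg_pssm, pse_pssm, pse_aac])

# SVM-RFE: recursively drop the features with the smallest linear-SVM weights.
# The CBR step described in the abstract is NOT implemented here.
selector = RFE(SVC(kernel="linear"), n_features_to_select=20, step=5)

# Stacked ensemble: base learners feed out-of-fold predictions to a meta-learner.
# GradientBoostingClassifier is a stand-in for LightGBM (assumption).
stack = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
        ("gb", GradientBoostingClassifier(random_state=0)),
        ("svm", SVC(kernel="rbf", random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)

# Scaling and selection sit inside the pipeline so each fold refits them.
model = make_pipeline(StandardScaler(), selector, stack)

# 5-fold cross-validated accuracy, the evaluation protocol named in the abstract.
scores = cross_val_score(model, fused, y, cv=5, scoring="accuracy")
print("Mean 5-fold accuracy:", scores.mean())
```

Placing feature selection inside the pipeline keeps it from seeing the held-out fold, which avoids an optimistic bias in the cross-validated accuracy. Whether the original study selected features inside or outside the cross-validation loop is not stated in the abstract.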
