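Feature selection
Computer science
Feature (linguistics)
Artificial intelligence
Normalization (sociology)
Feature learning
Representation (politics)
Coronavirus
Machine learning
Pattern recognition (psychology)
Data mining
Coronavirus disease 2019 (COVID-19)
Medicine
Philosophy
Linguistics
Disease
Pathology
Politics
Political science
Infectious disease (medical specialty)
Law
Sociology
Anthropology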
Authors
Shouzhi Chen, Yanhong Liao, Jianping Zhao, Yannan Bin, Chun-Hou Zheng
Identifier
DOI: 10.1109/tcbb.2023.3238370
Abstract
Due to the global outbreak of COVID-19 and its variants, antiviral peptides with anti-coronavirus activity (ACVPs) are promising new drug candidates for the treatment of coronavirus infection. Several computational tools have been developed to identify ACVPs, but their overall prediction performance is not yet sufficient for practical therapeutic application. In this study, we constructed an efficient and reliable prediction model, PACVP (Prediction of Anti-CoronaVirus Peptides), for identifying ACVPs based on effective feature representation and a two-layer stacking learning framework. In the first layer, we use nine feature encoding methods, each representing the sequence from a different angle, to characterize the rich sequence information, and fuse them into a single feature matrix. Data normalization and imbalanced-data processing are then carried out. Next, 12 baseline models are constructed by combining three feature selection methods with four machine learning classification algorithms. In the second layer, we input the optimal probability features into a logistic regression (LR) algorithm to train the final model, PACVP. Experiments show that PACVP achieves favorable prediction performance on an independent test dataset, with an accuracy (ACC) of 0.9208 and an area under the curve (AUC) of 0.9465. We hope that PACVP will become a useful method for identifying, annotating and characterizing novel ACVPs.
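
The two-layer stacking idea described in the abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not name the three feature selection methods, the four classifiers, or the imbalance-handling technique, so the scikit-learn components below are hypothetical stand-ins, the data is synthetic rather than encoded peptide sequences, and all 12 probability features are passed to the second layer instead of an "optimal" subset.

```python
from itertools import product

import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.feature_selection import (SelectFromModel, SelectKBest,
                                       f_classif, mutual_info_classif)
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, roc_auc_score
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for the fused feature matrix (assumption: real input
# would be peptide sequences encoded by nine feature encoding methods).
X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           weights=[0.7, 0.3], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Hypothetical choices: three feature selectors and four classifiers.
selectors = [
    SelectKBest(f_classif, k=15),
    SelectKBest(mutual_info_classif, k=15),
    SelectFromModel(ExtraTreesClassifier(n_estimators=50, random_state=0)),
]
classifiers = [
    RandomForestClassifier(n_estimators=100, random_state=0),
    SVC(probability=True, random_state=0),
    KNeighborsClassifier(n_neighbors=5),
    GaussianNB(),
]

# Layer 1: 3 selectors x 4 classifiers = 12 baseline pipelines.
# clone() gives each pipeline its own unfitted copies of the components.
baselines = [
    make_pipeline(MinMaxScaler(), clone(sel), clone(clf))
    for sel, clf in product(selectors, classifiers)
]

# Out-of-fold positive-class probabilities become the layer-2 training
# features, so the meta-model never sees predictions made on data the
# baseline was fitted on.
meta_train = np.column_stack([
    cross_val_predict(pipe, X_train, y_train, cv=5,
                      method="predict_proba")[:, 1]
    for pipe in baselines
])

# Refit each baseline on the full training set to score the test set.
meta_test = np.column_stack([
    pipe.fit(X_train, y_train).predict_proba(X_test)[:, 1]
    for pipe in baselines
])

# Layer 2: logistic regression on the probability features.
meta_model = LogisticRegression(max_iter=1000)
meta_model.fit(meta_train, y_train)

acc = accuracy_score(y_test, meta_model.predict(meta_test))
auc = roc_auc_score(y_test, meta_model.predict_proba(meta_test)[:, 1])
print(f"ACC={acc:.4f}  AUC={auc:.4f}")
```

Generating the layer-1 features with cross-validated predictions is one common way to limit leakage into the second layer; the abstract does not state how PACVP produces or selects its probability features, so that step is an assumption here as well.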