Alignment-Free Antimicrobial Peptide Predictors: Improving Performance by a Thorough Analysis of the Largest Available Data Set

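Benchmarking, Set (abstract data type), Data set, Machine learning, Computer science, Random forest, Data mining, Artificial intelligence, Marketing, Business, Programming language
Authors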
Sergio A. Pinacho-Castellanos,César R. García‐Jacas,Michael K. Gilson,Carlos A. Brizuela
Source
Journal: Journal of Chemical Information and Modeling [American Chemical Society]
Volume (issue): 61 (6): 3141-3157 Citations: 56
Identifier
DOI:10.1021/acs.jcim.1c00251
Abstract

In the last two decades, a large number of machine-learning-based predictors for the activities of antimicrobial peptides (AMPs) have been proposed. These predictors differ from one another in the learning method and in the training and testing data sets used. Unfortunately, the training data sets have several drawbacks, such as low representativeness of the experimentally validated AMP space and peptide sequences duplicated between the negative and positive data sets. These limitations lend low confidence to the use of most of these approaches in prospective studies. To address these weaknesses, we propose novel modeling and assessment data sets derived from the largest experimentally validated nonredundant peptide data set reported to date. From these novel data sets, alignment-free quantitative sequence–activity models (AF-QSAMs) based on Random Forest are created to identify general AMPs and their antibacterial, antifungal, antiparasitic, and antiviral functional types. An applicability domain analysis is carried out to determine the reliability of the predictions obtained; to the best of our knowledge, this is the first time such an analysis has been performed for AMP recognition. The proposed models are benchmarked against several models from the literature that are freely available in 13 programs (ClassAMP, iAMP-2L, ADAM, MLAMP, AMPScanner v2.0, AntiFP, AMPfun, PEPred-suite, AxPEP, CAMPR3, iAMPpred, APIN, and Meta-iAVP). The proposed models perform best in all of the endpoints modeled, while most of the methods from the literature show weak-to-random predictive agreement. The proposed models are also assessed through Y-scrambling and repeated k-fold cross-validation tests, demonstrating that their outcomes are not due to chance. Three chemometric analyses also confirm the relevance of the peptide descriptors used in the modeling.
Therefore, it can be concluded that the models built by fixing the drawbacks existing in the literature contribute to identifying antibacterial, antifungal, antiparasitic, and antiviral peptides with high effectiveness and reliability. The models are freely available via the AMPDiscover tool at https://biocom-ampdiscover.cicese.mx/.