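Computational biology
Genome
Computer science
Phenology
Novelty
Lasso (programming language)
Biology
Artificial intelligence
Genomics
Machine learning
Gene
Genetics
Theology
World Wide Web
Philosophy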
Authors
Jonathan I. Tietz, Christopher J. Schwalen, Parth Patel, Tucker Maxson, Patricia M. Blair, Hua-Chia Tai, Uzma I. Zakai, Douglas A. Mitchell
Identifier
DOI:10.1038/nchembio.2319
Abstract
Ribosomally synthesized and post-translationally modified peptide (RiPP) natural products are attractive for genome-driven discovery and re-engineering, but limitations in bioinformatic methods and exponentially increasing genomic data make large-scale mining of RiPP data difficult. We report RODEO (Rapid ORF Description and Evaluation Online), which combines hidden-Markov-model-based analysis, heuristic scoring, and machine learning to identify biosynthetic gene clusters and predict RiPP precursor peptides. We initially focused on lasso peptides, which display intriguing physicochemical properties and bioactivities, but their hypervariability renders them challenging prospects for automated mining. Our approach yielded the most comprehensive mapping to date of lasso peptide space, revealing >1,300 compounds. We characterized the structures and bioactivities of six lasso peptides, prioritized based on predicted structural novelty, including one with an unprecedented handcuff-like topology and another with a citrulline modification exceptionally rare among bacteria. These combined insights significantly expand the knowledge of lasso peptides and, more broadly, provide a framework for future genome-mining efforts.