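GRASP
Index
Sequence (biology)
Protein engineering
Representation (politics)
Computational biology
Inference
Biology
Sequence alignment
Identification (biology)
Multiple sequence alignment
Protein sequencing
Computer science
Genetics
Artificial intelligence
Gene
Peptide sequence
Enzyme
Biochemistry
Plant
Politics
Genotype
Single-nucleotide polymorphism
Political science
Law
Programming language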
Authors
Gabriel Foley, Ariane Mora, Connie M. Ross, Scott Bottoms, Leander Sützl, Marnie L. Lamprecht, Julian Zaugg, Alexandra Essebier, Brad Balderson, R.B. Newell, Raine E. S. Thomson, Boštjan Kobe, Ross Barnard, Luke W. Guddat, Gerhard Schenk, Jörg Carsten, Yosephine Gumulya, Burkhard Rost, Dietmar Haltrich, Volker Sieber, Elizabeth M. J. Gillam, Mikael Bodén
Identifier
DOI:10.1371/journal.pcbi.1010633
Abstract
Ancestral sequence reconstruction is a technique that is gaining widespread use in molecular evolution studies and protein engineering. Accurate reconstruction requires the ability to handle appropriately large numbers of sequences, as well as insertion and deletion (indel) events, but available approaches exhibit limitations. To address these limitations, we developed Graphical Representation of Ancestral Sequence Predictions (GRASP), which efficiently implements maximum likelihood methods to enable the inference of ancestors of families with more than 10,000 members. GRASP implements partial order graphs (POGs) to represent and infer insertion and deletion events across ancestors, enabling the identification of building blocks for protein engineering. To validate the capacity to engineer novel proteins from realistic data, we predicted ancestor sequences across three distinct enzyme families: glucose-methanol-choline (GMC) oxidoreductases, cytochromes P450, and dihydroxy/sugar acid dehydratases (DHAD). All tested ancestors demonstrated enzymatic activity. Our study demonstrates the ability of GRASP (1) to support large data sets over 10,000 sequences and (2) to employ insertions and deletions to identify building blocks for engineering biologically active ancestors, by exploring variation over evolutionary time.
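The partial order graph idea mentioned in the abstract can be illustrated with a toy sketch. This is not GRASP's implementation: the function names `build_pog` and `indel_edges`, the example alignment, and the choice to use alignment columns as nodes are all assumptions made for illustration. The sketch only shows how edges that skip columns can encode alternative insertion and deletion paths in a set of aligned sequences.

```python
from collections import defaultdict


def build_pog(aligned):
    """Build a toy partial order graph from gapped, equal-length sequences.

    Nodes are alignment column indices. A directed edge (i, j) means at least
    one sequence has residues in columns i and j with only gaps in between.
    The value stored per edge is the number of sequences supporting it.
    """
    if len({len(s) for s in aligned}) > 1:
        raise ValueError("aligned sequences must share one length")
    edges = defaultdict(int)
    for seq in aligned:
        # Columns where this sequence has a residue rather than a gap.
        occupied = [i for i, ch in enumerate(seq) if ch != "-"]
        for a, b in zip(occupied, occupied[1:]):
            edges[(a, b)] += 1
    return dict(edges)


def indel_edges(edges):
    """Return edges that jump over at least one column (candidate indel paths)."""
    return {e: n for e, n in edges.items() if e[1] - e[0] > 1}


# Hypothetical three-sequence alignment, used only for illustration.
alignment = ["MKT-LV", "MK--LV", "MKTALV"]
pog = build_pog(alignment)
print(indel_edges(pog))  # {(2, 4): 1, (1, 4): 1}
```

In this toy graph, the edge (2, 4) bypasses column 3 and the edge (1, 4) bypasses columns 2 and 3, so each path through the graph corresponds to one way of including or omitting the gapped region.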