Gene
Computational biology
Genetics
Biology
Microbial genetics
Biotechnology
Authors
Yancong Zhang, Amrisha Bhosle, Sena Bae, Kelly Eckenrode, Xueying Huang, Jingjing Tang, Danylo Lavrentovich, Lana Awad, Hua Ji, Ya Wang, Xochitl C. Morgan, Bin Li, Andy Krueger, Wendy S. Garrett, Eric A. Franzosa, Curtis Huttenhower
Identifier
DOI:10.1038/s41587-025-02813-7
Abstract
The majority of genes in microbial communities remain uncharacterized. Here we develop a method to infer putative function for microbial proteins at scale by assessing community-wide multiomics data. We predict high-confidence functions for >443,000 protein families (~82.3% previously uncharacterized), including >27,000 protein families with weak homology to known proteins and >6,000 protein families without homology. These were drawn from 1,595 gut metagenomes and 800 metatranscriptomes from the Integrative Human Microbiome Project (HMP2/iHMP). Integrating additional information such as sequence similarity, genomic proximity and domain–domain interactions improves performance of the method. Our method’s implementation, FUGAsseM, is generalizable and predicts protein function in both well-studied and undercharacterized communities. FUGAsseM achieves similar levels of accuracy in the context of microbial communities when compared to state-of-the-art approaches designed for application to single organisms while simultaneously providing much greater breadth of coverage. This initial study expands the functional landscape of the human gut microbiome and allows for exploration of microbial proteins in undercharacterized communities.