Genome
Computer science
Host (biology)
Identification (biology)
Benchmark (surveying)
Microbiome
Computational biology
DNA sequencing
Data mining
Biology
DNA
Gene
Bioinformatics
Genetics
Ecology
Geodesy
Geography
Authors
Caitlin Guccione, Lucas Patel, Yoshihiko Tomofuji, Daniel McDonald, Antonio González, Gregory D. Sepich-Poore, Kyuto Sonehara, Mohsen Zakeri, Yang Chen, Amanda Hazel Dilmore, Nikhil P. Damle, Sergio E. Baranzini, George K. Hightower, Teruaki Nakatsuji, Richard L. Gallo, Ben Langmead, Yukinori Okada, Kit Curtius, Robert J. Knight
Identifier
DOI: 10.1038/s41467-025-56077-5
Abstract
As next-generation sequencing technologies produce deeper genome coverages at lower costs, there is a critical need for reliable computational host DNA removal in metagenomic data. We find that insufficient host filtration using prior human genome references can introduce false sex biases and inadvertently permit flow-through of host-specific DNA during bioinformatic analyses, which could be exploited for individual identification. To address these issues, we introduce and benchmark three host filtration methods of varying throughput, with concomitant applications across low biomass samples such as skin and high microbial biomass datasets including fecal samples. We find that these methods are important for obtaining accurate results in low biomass samples (e.g., tissue, skin). Overall, we demonstrate that rigorous host filtration is a key component of privacy-minded analyses of patient microbiomes and provide computationally efficient pipelines for accomplishing this task on large-scale datasets.