Imputation (statistics)
Leverage (statistics)
Biology
1000 Genomes Project
Genome-wide association study
Genotyping
SNP
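Missing data
Minor allele frequency
Computer science
Allele frequency
Computational biology
Data mining
Genetics
Single-nucleotide polymorphism
Artificial intelligence
Machine learning
Gene
Genotype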
Authors
Simone Rubinacci, Diogo M. Ribeiro, Robin J. Hofmeister, Olivier Delaneau
Source
Journal: Nature Genetics
[Nature Portfolio]
Date: 2021-01-01
Volume (issue): 53 (1): 120-126
Citations: 296
Identifiers
DOI:10.1038/s41588-020-00756-0
Abstract
Low-coverage whole-genome sequencing followed by imputation has been proposed as a cost-effective genotyping approach for disease and population genetics studies. However, its competitiveness against SNP arrays is undermined because current imputation methods are computationally expensive and unable to leverage large reference panels. Here, we describe a method, GLIMPSE, for phasing and imputation of low-coverage sequencing datasets from modern reference panels. We demonstrate its remarkable performance across different coverages and human populations. GLIMPSE achieves imputation of a genome for less than US$1 in computational cost, considerably outperforming other methods and improving imputation accuracy over the full allele frequency range. As a proof of concept, we show that 1× coverage enables effective gene expression association studies and outperforms dense SNP arrays in rare variant burden tests. Overall, this study illustrates the promising potential of low-coverage imputation and suggests a paradigm shift in the design of future genomic studies.