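Imputation (statistics)
Pedigree chart
Population
Minor allele frequency
1000 Genomes Project
Biology
Genome-wide association study
Statistics
Genetics
Missing data
Allele frequency
Genotype
Mathematics
Single-nucleotide polymorphism
Demography
Gene
Sociology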
Authors
Ehsan Ullah, Raghvendra Mall, Mostafa Abbas, Khalid Kunji, Alejandro Q. Nato, Halima Bensmail, Ellen M. Wijsman, Mohamad Saad
Source
Journal: Genome Research
[Cold Spring Harbor Laboratory]
Date: 2018-12-04
Volume (issue): pages: 29 (1): 125-134
Citations: 26
Identifier
DOI: 10.1101/gr.236315.118
Abstract
Genotype imputation is widely used in genome-wide association studies to boost variant density, allowing increased power in association testing. Many studies currently include pedigree data due to increasing interest in rare variants coupled with the availability of appropriate analysis tools. The performance of population-based (subjects are unrelated) imputation methods is well established. However, the performance of family- and population-based imputation methods on family data has been subject to much less scrutiny. Here, we extensively compare several family- and population-based imputation methods on family data of large pedigrees with both European and African ancestry. Our comparison includes many widely used family- and population-based tools and another method, Ped_Pop, which combines family- and population-based imputation results. We also compare four subject selection strategies for full sequencing to serve as the reference panel for imputation: GIGI-Pick, ExomePicks, PRIMUS, and random selection. Moreover, we compare two imputation accuracy metrics: the Imputation Quality Score and Pearson's correlation R² for predicting power of association analysis using imputation results. Our results show that (1) GIGI outperforms Merlin; (2) family-based imputation outperforms population-based imputation for rare variants but not for common ones; (3) combining family- and population-based imputation outperforms all imputation approaches for all minor allele frequencies; (4) GIGI-Pick gives the best selection strategy based on the R² criterion; and (5) R² is the best measure of imputation accuracy. Our study is the first to extensively evaluate the imputation performance of many available family- and population-based tools on the same family data and provides guidelines for future studies.
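As a rough illustration of the R² accuracy metric named in the abstract, the sketch below computes the squared Pearson correlation between true genotypes and imputed dosages for a single variant. The function name `pearson_r2`, the 0/1/2 minor-allele coding, and the sample values are assumptions made for this example; they are not taken from the paper, and the study's own pipeline may compute the metric differently.

```python
import math


def pearson_r2(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages.

    Assumes genotypes are coded as minor-allele counts (0, 1, 2) and dosages
    are expected allele counts in [0, 2]. Returns NaN when either vector has
    zero variance (e.g. a monomorphic variant), where R^2 is undefined.
    """
    n = len(true_genotypes)
    if n != len(imputed_dosages) or n < 2:
        raise ValueError("need two equal-length vectors with at least 2 subjects")

    mean_t = sum(true_genotypes) / n
    mean_d = sum(imputed_dosages) / n

    # Sums of squared deviations and cross-products (unnormalised;
    # the 1/n factors cancel in the ratio below).
    cov = sum((t - mean_t) * (d - mean_d)
              for t, d in zip(true_genotypes, imputed_dosages))
    var_t = sum((t - mean_t) ** 2 for t in true_genotypes)
    var_d = sum((d - mean_d) ** 2 for d in imputed_dosages)

    if var_t == 0 or var_d == 0:
        return math.nan
    return (cov * cov) / (var_t * var_d)


# Hypothetical data: four subjects, one variant.
truth = [0, 1, 2, 1]
dosage = [0.1, 0.9, 1.8, 1.2]
r2 = pearson_r2(truth, dosage)
```

In practice a metric like this would be evaluated per variant and then summarised within minor allele frequency bins, which is how the abstract frames the comparison between rare and common variants.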