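Haplotype
Genetics
Biology
Genome
Single-nucleotide polymorphism
Reference genome
Population
Tandem repeat
Genetic variation
Genome-wide association study
Loss of heterozygosity
1000 Genomes Project
Microsatellite
Computational biology
Genotype
Evolutionary biology
Gene
Allele
Medicine
Environmental health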
Authors
Helyaneh Ziaei Jam, Yang Li, Ross DeVito, Nima Mousavi, Nichole Ma, Ibra Lujumba, Yagoub Adam, Mikhail O. Maksimov, Bonnie Huang, Egor Dolzhenko, Yunjiang Qiu, Fredrick Elishama Kakembo, Habi Joseph, Blessing Onyido, Jumoke Adeyemi, Mehrdad Bakhtiari, Jonghun Park, Sara Javadzadeh, Daudi Jjingo, Marion O. Adebiyi
Identifier
DOI: 10.1038/s41467-023-42278-3
Abstract
Tandem repeats (TRs) represent one of the largest sources of genetic variation in humans and are implicated in a range of phenotypes. Here we present a deep characterization of TR variation based on high coverage whole genome sequencing from 3550 diverse individuals from the 1000 Genomes Project and H3Africa cohorts. We develop a method, EnsembleTR, to integrate genotypes from four separate methods resulting in high-quality genotypes at more than 1.7 million TR loci. Our catalog reveals novel sequence features influencing TR heterozygosity, identifies population-specific trinucleotide expansions, and finds hundreds of novel eQTL signals. Finally, we generate a phased haplotype panel which can be used to impute most TRs from nearby single nucleotide polymorphisms (SNPs) with high accuracy. Overall, the TR genotypes and reference haplotype panel generated here will serve as valuable resources for future genome-wide and population-wide studies of TRs and their role in human phenotypes.