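Biology
Genetics
Locus (genetics)
Haplotype
Allele
Structural variation
Nucleotide diversity
Population
Gene
Genome
Copy-number variation
Computational biology
Demography
Sociology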
Authors
William S. Gibson, Oscar L. Rodriguez, Kaitlyn Shields, Catherine A. Silver, Abdullah Dorgham, Matthew Emery, Gintaras Deikus, Robert Sebra, Evan E. Eichler, Ali Kashif Bashir, Melissa Smith, Corey T. Watson
Identifier
DOI:10.1038/s41435-022-00188-2
Abstract
Immunoglobulins (IGs), crucial components of the adaptive immune system, are encoded by three genomic loci. However, the complexity of the IG loci severely limits the effective use of short-read sequencing, restricting our knowledge of population diversity in these loci. We leveraged existing long-read whole-genome sequencing (WGS) data, fosmid technology, and IG-targeted single-molecule, real-time (SMRT) long-read sequencing (IG-Cap) to create haplotype-resolved assemblies of the IG lambda (IGL) locus from 6 ethnically diverse individuals. In addition, we generated 10 diploid assemblies of IGL from a diverse cohort of individuals using IG-Cap. From these 16 individuals, we identified significant allelic diversity, including 36 novel IGLV alleles. We also observed highly elevated single-nucleotide variation (SNV) in IGLV genes relative to IGL intergenic and genomic background SNV density. By comparing SNV calls between our high-quality assemblies and existing short-read datasets from the same individuals, we show a high propensity for false positives in the short-read datasets. Finally, for the first time, we resolved at the nucleotide level common 5-10 Kb duplications in the IGLC region that contain functional IGLJ and IGLC genes. Together, these data represent a significant advance in our understanding of genetic variation and population diversity in the IGL locus.