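Imputation (statistics)
Human leukocyte antigen
Linkage disequilibrium
Major histocompatibility complex
Allele
Genetics
Genotype
Allele frequency
Biobank
Biology
Computational biology
Haplotype
Computer science
Gene
Missing data
Antigen
Machine learning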
Authors
Tatsuhiko Naito, Ken Suzuki, Jun Hirata, Yoichiro Kamatani, Koichi Matsuda, Tatsushi Toda, Yukinori Okada
Identifier
DOI:10.1038/s41467-021-21975-x
Abstract
Conventional human leukocyte antigen (HLA) imputation methods perform poorly for infrequent alleles, which is one of the factors that reduce the reliability of trans-ethnic major histocompatibility complex (MHC) fine-mapping due to inter-ethnic heterogeneity in allele frequency spectra. We develop DEEP*HLA, a deep learning method for imputing HLA genotypes. Through validation using the Japanese and European HLA reference panels (n = 1,118 and 5,122), DEEP*HLA achieves the highest accuracies with significant superiority for low-frequency and rare alleles. DEEP*HLA is less dependent on distance-dependent linkage disequilibrium decay of the target alleles and might capture the complicated region-wide information. We apply DEEP*HLA to type 1 diabetes GWAS data from BioBank Japan (n = 62,387) and UK Biobank (n = 354,459), and successfully disentangle independently associated class I and II HLA variants with shared risk among diverse populations (the top signal at amino acid position 71 of HLA-DRβ1; P = 7.5 × 10^−120). Our study illustrates the value of deep learning in genotype imputation and trans-ethnic MHC fine-mapping.