基因分型
生命银行
插补(统计学)
一致性
SNP基因分型
单核苷酸多态性
SNP阵列
生物
遗传学
计算生物学
dbSNP公司
基因组
计算机科学
基因型
基因
缺少数据
机器学习
作者
Zijian Tian,Fei Chen,Jing Wang,Benrui Wu,Jian Shao,Ziqing Liu,Zheng Li,You Wang,Tao Xu,Kaixin Zhou
标识
DOI:10.1093/pcmedi/pbad002
摘要
Abstract Background Chronic diseases are becoming a critical challenge to the aging Chinese population. Biobanks with extensive genomic and environmental data offer opportunities to elucidate the complex gene–environment interactions underlying their aetiology. Genome-wide genotyping array remains an efficient approach for large-scale genomic data collection. However, most commercial arrays have reduced performance for biobanking in the Chinese population. Materials and methods Deep whole-genome sequencing data from 2 641 Chinese individuals were used as a reference to develop the CAS array, a custom-designed genotyping array for precision medicine. Evaluation of the array was performed by comparing data from 384 individuals assayed both by the array and whole-genome sequencing. Validation of its mitochondrial copy number estimating capacity was conducted by examining its association with established covariates among 10 162 Chinese elderly. Results The CAS Array adopts the proven Axiom technology and is restricted to 652 429 single-nucleotide polymorphism (SNP) markers. Its call rate of 99.79% and concordance rate of 99.89% are both higher than for commercial arrays. Its imputation-based genome coverage reached 98.3% for common SNPs and 63.0% for low-frequency SNPs, both comparable to commercial arrays with larger SNP capacity. After validating its mitochondrial copy number estimates, we developed a publicly available software tool to facilitate the array utility. Conclusion Based on recent advances in genomic science, we designed and implemented a high-throughput and low-cost genotyping array. It is more cost-effective than commercial arrays for large-scale Chinese biobanking.
科研通智能强力驱动
Strongly Powered by AbleSci AI