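Single-nucleotide polymorphism
Biology
Genetics
Computational biology
Population
Genome-wide association study
Genetic association
Best linear unbiased prediction
Lasso (programming language)
Genomics
Quantitative trait locus
R package
SNP (company)
Genetic variation
Predictive modelling
Computer science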
Authors
Chuanke Fu, Job van Schipstal, Mario P. L. Calus, Pascal Duenk
Source
Journal: Genetics
[Oxford University Press]
Date: 2025-11-14
Identifier
DOI: 10.1093/genetics/iyaf245
Abstract
Although standard genomic prediction (GP) models such as GBLUP assume that single nucleotide polymorphisms (SNPs) contribute equally to genetic variation, some SNPs may be more informative than others because they are more closely linked to causal variants. GP models could therefore be fine-tuned by incorporating biological annotations. Here, we used Combined Annotation Dependent Depletion (CADD) scores, which reflect the likelihood of a genetic variant being deleterious, as prior information in genomic prediction. Our objective was to determine the benefit of using CADD scores to select or weigh SNPs in genomic prediction. We analyzed 10 traits in a dataset of 835 mice from the Diversity Outbred (DO) mouse population. For selecting or weighing SNPs, we either used the CADD scores at the exact position of SNPs (CADD-SNP), or the maximum CADD score in a predefined window around the SNPs (CADD-window). In addition, we employed five GP models (GBLUP, BayesA, BayesB, BayesC, and BayesR) to analyze different sets of selected SNPs, and a weighted GBLUP model for weighing scenarios. The results showed that selecting SNPs based on CADD-SNP did not improve prediction accuracy. In contrast, compared to using all SNPs, selecting the top 40% of SNPs based on CADD-window was the optimal scenario. This approach effectively removed non-informative SNPs and improved prediction accuracy for at least six out of 10 traits. The improvements among these traits ranged from an average of 0.014 for body weight at 10 weeks to 0.094 for bone mineral density across five GP models. Weighing (selected) SNPs based on either CADD-SNP or CADD-window had little impact on accuracy. In conclusion, using CADD-window scores to select SNPs improved prediction accuracy, but the benefit depended on the trait of interest and the GP model that was used, while using CADD scores to weigh SNPs did not improve prediction accuracy.
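
To make the CADD-window idea concrete, the following is a minimal sketch in Python of how a window-based score and a top-fraction selection could be computed. It is not the authors' pipeline. The function names (`cadd_window_scores`, `select_top_fraction`), the symmetric `half_window` parameter, the single-chromosome assumption, and the treatment of SNPs with no annotated position in their window (ranked last) are all assumptions made for illustration. The abstract does not state the window size or how ties and missing scores were handled.

```python
import numpy as np


def cadd_window_scores(snp_pos, cadd_pos, cadd_score, half_window):
    """Maximum CADD score within +/- half_window bp of each SNP.

    Assumes all positions lie on one chromosome. Returns NaN for a SNP
    whose window contains no annotated position (an illustrative choice).
    """
    cadd_pos = np.asarray(cadd_pos)
    cadd_score = np.asarray(cadd_score, dtype=float)
    snp_pos = np.asarray(snp_pos)

    # Sort annotations by position so windows map to contiguous slices.
    order = np.argsort(cadd_pos)
    cadd_pos = cadd_pos[order]
    cadd_score = cadd_score[order]

    # Slice bounds per SNP; both window edges are inclusive.
    lo = np.searchsorted(cadd_pos, snp_pos - half_window, side="left")
    hi = np.searchsorted(cadd_pos, snp_pos + half_window, side="right")

    out = np.full(len(snp_pos), np.nan)
    for i, (a, b) in enumerate(zip(lo, hi)):
        if b > a:
            out[i] = cadd_score[a:b].max()
    return out


def select_top_fraction(scores, fraction=0.40):
    """Indices of the SNPs with the highest scores, in original order.

    SNPs with a NaN score are ranked last (hypothetical convention).
    """
    scores = np.asarray(scores, dtype=float)
    n_keep = int(round(fraction * len(scores)))
    filled = np.where(np.isnan(scores), -np.inf, scores)
    ranked = np.argsort(-filled, kind="stable")
    return np.sort(ranked[:n_keep])
```

The selected indices would then be used to subset the genotype matrix before fitting a GP model such as GBLUP. Setting `half_window=0` reduces the window score to the score at the exact SNP position, which corresponds to the CADD-SNP variant described in the abstract.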