Genomic prediction using mCADD scores as prior information in a mouse population

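Single-nucleotide polymorphism; Biology; Genetics; Computational biology; Population; Genome-wide association study; Genetic association; Best linear unbiased prediction; Lasso (programming language); Genomics; Quantitative trait locus; R package; SNP (company); Genetic variation; Predictive modelling; Computer science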

Authors

Chuanke Fu, Job van Schipstal, Mario P. L. Calus, Pascal Duenk

Source

Journal: Genetics [Oxford University Press]
Date: 2025-11-14

Links

nih.gov, doi.org

Identifiers

DOI: 10.1093/genetics/iyaf245

Abstract

Although standard genomic prediction (GP) models such as GBLUP assume that single nucleotide polymorphisms (SNPs) contribute equally to genetic variation, some SNPs may be more informative than others because they are more closely linked to causal variants. GP models could therefore be fine-tuned by incorporating biological annotations. Here, we used Combined Annotation Dependent Depletion (CADD) scores, which reflect the likelihood of a genetic variant being deleterious, as prior information in genomic prediction. Our objective was to determine the benefit of using CADD scores to select or weigh SNPs in genomic prediction. We analyzed 10 traits in a dataset of 835 mice from the Diversity Outbred (DO) mouse population. For selecting or weighing SNPs, we used either the CADD score at the exact position of each SNP (CADD-SNP) or the maximum CADD score in a predefined window around each SNP (CADD-window). In addition, we employed five GP models (GBLUP, BayesA, BayesB, BayesC, and BayesR) to analyze different sets of selected SNPs, and a weighted GBLUP model for the weighing scenarios. The results showed that selecting SNPs based on CADD-SNP did not improve prediction accuracy. In contrast, selecting the top 40% of SNPs based on CADD-window was the optimal scenario and outperformed using all SNPs. This approach effectively removed non-informative SNPs and improved prediction accuracy for at least six out of 10 traits. Averaged across the five GP models, the improvements among these traits ranged from 0.014 for body weight at 10 weeks to 0.094 for bone mineral density. Weighing (selected) SNPs based on either CADD-SNP or CADD-window had little impact on accuracy. In conclusion, using CADD-window scores to select SNPs improved prediction accuracy, but the benefit depended on the trait of interest and the GP model that was used, while using CADD scores to weigh SNPs did not improve prediction accuracy.
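
The CADD-window selection described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `half_window` parameter, and the restriction to a single chromosome are assumptions made for illustration, and the abstract does not state the window size that was used. The sketch assigns each SNP the maximum CADD score found within a window around its position, then keeps the top fraction of SNPs by that score.

```python
import numpy as np


def cadd_window_scores(snp_pos, scored_pos, scored_cadd, half_window):
    """Max CADD score within +/- half_window bp of each SNP (one chromosome).

    Hypothetical helper: `half_window` stands in for the paper's
    "predefined window", whose size the abstract does not give.
    SNPs with no scored position in their window get NaN.
    """
    order = np.argsort(scored_pos)
    pos = np.asarray(scored_pos)[order]
    cadd = np.asarray(scored_cadd, dtype=float)[order]
    snp_pos = np.asarray(snp_pos)

    # Slice bounds of the scored positions that fall inside each window.
    lo = np.searchsorted(pos, snp_pos - half_window, side="left")
    hi = np.searchsorted(pos, snp_pos + half_window, side="right")

    out = np.full(len(snp_pos), np.nan)
    for i, (a, b) in enumerate(zip(lo, hi)):
        if b > a:
            out[i] = cadd[a:b].max()
    return out


def select_top_fraction(scores, fraction=0.4):
    """Sorted indices of the SNPs ranked in the top `fraction` by score.

    SNPs without a score (NaN) are ranked last.
    """
    scores = np.asarray(scores, dtype=float)
    n_keep = int(round(fraction * len(scores)))
    filled = np.where(np.isnan(scores), -np.inf, scores)
    ranked = np.argsort(-filled, kind="stable")  # descending, ties keep order
    return np.sort(ranked[:n_keep])
```

The selected indices would then be used to subset the genotype matrix before fitting a GP model such as GBLUP; setting `half_window=0` reduces the window score to the score at the SNP's own position, which corresponds to the CADD-SNP variant.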
