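Genomic selection
Best linear unbiased prediction
Computer science
Artificial intelligence
Regression
Trait
Support vector machine
Machine learning
Elastic net regularization
Linear regression
Feature selection
Selection (genetic algorithm)
Statistics
Biology
Mathematics
Genetics
Gene
Genotype
Single-nucleotide polymorphism
Programming language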
Authors
Mang Liang,Tianpeng Chang,Bingxing An,Xinghai Duan,Lili Du,Xiao-Qiao Wang,Jian Miao,Lingyang Xu,Xue Gao,Lupei Zhang,Junya Li,Huijiang Gao
Identifier
DOI:10.3389/fgene.2021.600040
Abstract
Machine learning (ML) is perhaps the most useful tool for the interpretation of large genomic datasets. However, the performance of a single machine learning method in genomic selection (GS) is currently unsatisfactory. To improve genomic predictions, we constructed a stacking ensemble learning framework (SELF) that integrates three machine learning methods to predict genomic estimated breeding values (GEBVs). We evaluated the prediction ability of SELF on three real datasets with different genetic architectures, comparing its prediction accuracy with that of its base learners, genomic best linear unbiased prediction (GBLUP), and BayesB. For each trait, SELF performed better than the base learners, which were support vector regression (SVR), kernel ridge regression (KRR), and elastic net (ENET). Averaged over the three datasets, the prediction accuracy of SELF was 7.70% higher than that of GBLUP. Except for the milk fat percentage (MFP) trait in the German Holstein dairy cattle dataset, SELF was more robust than BayesB for all remaining traits. We therefore believe that SELF has the potential to be used for estimating GEBVs in other animals and plants.
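The stacking idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's StackingRegressor, a linear-regression meta-learner, arbitrary hyperparameters, simulated SNP genotypes coded 0/1/2, and Pearson correlation as the accuracy measure. The abstract specifies none of these; only the three base learners (SVR, KRR, ENET) come from it.

```python
import numpy as np
from sklearn.ensemble import StackingRegressor
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import ElasticNet, LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVR

# Simulated stand-in data (NOT from the paper): 200 individuals, 500 SNPs
# coded 0/1/2, with 20 causal markers plus Gaussian noise.
rng = np.random.default_rng(0)
n_individuals, n_snps = 200, 500
X = rng.integers(0, 3, size=(n_individuals, n_snps)).astype(float)
effects = np.zeros(n_snps)
effects[:20] = rng.normal(0.0, 1.0, size=20)
y = X @ effects + rng.normal(0.0, 1.0, size=n_individuals)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# The three base learners named in the abstract; hyperparameters are
# illustrative assumptions, not tuned values from the study.
base_learners = [
    ("svr", SVR(kernel="rbf", C=1.0)),
    ("krr", KernelRidge(kernel="rbf", alpha=1.0)),
    ("enet", ElasticNet(alpha=0.1, l1_ratio=0.5, max_iter=10000)),
]

# Assumed meta-learner: ordinary linear regression fitted on the
# out-of-fold (5-fold) predictions of the base learners.
stack = StackingRegressor(
    estimators=base_learners,
    final_estimator=LinearRegression(),
    cv=5,
)
stack.fit(X_train, y_train)
gebv_hat = stack.predict(X_test)

# Assumed accuracy measure: Pearson correlation between predicted and
# observed phenotypes on the held-out individuals.
accuracy = np.corrcoef(gebv_hat, y_test)[0, 1]
print(f"Pearson correlation on held-out individuals: {accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for the same individuals the base learners were fitted on, keeps it from simply rewarding whichever base learner overfits the training set most.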