Using machine learning to realize genetic site screening and genomic prediction of productive traits in pigs

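Artificial intelligence; Genomic selection; Support vector machine; Random forest; Feature selection; Machine learning; Gradient boosting; Computer science; Elastic net regularization; Trait; Selection (genetic algorithm); Best linear unbiased prediction; Biology; Genetics; Gene; Single-nucleotide polymorphism; Genotype; Programming language
Authors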
Tao Xiang,Tao Li,Jielin Li,Xin Li,Jia Wang
Source
Journal: The FASEB Journal [Wiley]
Volume (Issue): 37 (6)
Identifier
DOI:10.1096/fj.202300245r
Abstract

Genomic prediction, which is based on solving linear mixed-model (LMM) equations, is the most popular method for predicting breeding values or phenotypic performance for economic traits in livestock. With the need to further improve the performance of genomic prediction, nonlinear methods have been considered as an alternative and promising approach. The excellent ability to predict phenotypes in animal husbandry has been demonstrated by machine learning (ML) approaches, which have been rapidly developed. To investigate the feasibility and reliability of implementing genomic prediction using nonlinear models, the performances of genomic predictions for pig productive traits using the linear genomic selection model and nonlinear machine learning models were compared. Then, to reduce the high-dimensional features of genome sequence data, different machine learning algorithms, including the random forest (RF), support vector machine (SVM), extreme gradient boosting (XGBoost) and convolutional neural network (CNN) algorithms, were used to perform genomic feature selection as well as genomic prediction on reduced feature genome data. All of the analyses were processed on two real pig datasets: the published PIC pig dataset and a dataset comprising data from a national pig nucleus herd in Chifeng, North China. Overall, the accuracies of predicted phenotypic performance for traits T1, T2, T3 and T5 in the PIC dataset and average daily gain (ADG) in the Chifeng dataset were higher using the ML methods than the LMM method, while those for trait T4 in the PIC dataset and total number of piglets born (TNB) in the Chifeng dataset were slightly lower using the ML methods than the LMM method. Among all the different ML algorithms, SVM was the most appropriate for genomic prediction. For the genomic feature selection experiment, the most stable and most accurate results across different algorithms were achieved using XGBoost in combination with the SVM algorithm. 
Through feature selection, the number of genomic markers can be reduced to 1 in 20, while the predictive performance on some traits can even be improved compared to using the full genome data. Finally, we developed a new tool that can be used to execute combined XGBoost and SVM algorithms to realize genomic feature selection and phenotypic prediction.
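
The two-stage workflow described in the abstract (rank markers, keep roughly 1 in 20, then predict phenotypes from the reduced marker set) can be illustrated with a minimal sketch. This is not the authors' tool: to stay dependency-free, the absolute marker-phenotype correlation stands in for XGBoost feature importance, and a sum of single-marker regression effects stands in for the SVM predictor. The simulated genotypes, the assumed causal markers in `effects`, the helper names, and the use of hold-out Pearson correlation as the accuracy measure are all hypothetical choices made for illustration.

```python
import random
import statistics


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either input is constant."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / (sxx * syy) ** 0.5


def select_markers(genotypes, phenotypes, keep_one_in=20):
    """Keep the top 1-in-`keep_one_in` markers, ranked by |correlation|.

    The correlation score is only a stand-in for the XGBoost importances
    named in the abstract.
    """
    n_markers = len(genotypes[0])
    scored = []
    for j in range(n_markers):
        column = [row[j] for row in genotypes]
        scored.append((abs(pearson(column, phenotypes)), j))
    n_keep = max(1, n_markers // keep_one_in)
    return sorted(j for _, j in sorted(scored, reverse=True)[:n_keep])


def fit_marker_score(genotypes, phenotypes, markers):
    """Fit single-marker regression slopes (a stand-in for the SVM step)."""
    mean_y = statistics.fmean(phenotypes)
    model = []
    for j in markers:
        column = [row[j] for row in genotypes]
        mean_x = statistics.fmean(column)
        sxx = sum((g - mean_x) ** 2 for g in column)
        sxy = sum((g - mean_x) * (y - mean_y) for g, y in zip(column, phenotypes))
        model.append((j, mean_x, sxy / sxx if sxx else 0.0))
    return mean_y, model


def predict(fitted, genotypes):
    """Predict phenotypes as the training mean plus summed marker effects."""
    mean_y, model = fitted
    return [mean_y + sum(b * (row[j] - mx) for j, mx, b in model) for row in genotypes]


# Simulated data (hypothetical): 120 animals, 200 SNPs coded 0/1/2.
rng = random.Random(0)
n_animals, n_markers = 120, 200
effects = {3: 1.0, 50: -0.8, 120: 0.6}  # assumed causal markers and effect sizes
X = [[rng.choice((0, 1, 2)) for _ in range(n_markers)] for _ in range(n_animals)]
y = [sum(b * row[j] for j, b in effects.items()) + rng.gauss(0, 1) for row in X]

X_train, y_train = X[:90], y[:90]
X_test, y_test = X[90:], y[90:]

# Feature selection uses the training animals only, to avoid leakage.
selected = select_markers(X_train, y_train)
fitted = fit_marker_score(X_train, y_train, selected)
accuracy = pearson(predict(fitted, X_test), y_test)
print(len(selected), "markers kept; hold-out correlation:", round(accuracy, 3))
```

Selecting markers on the training animals only matters here: ranking markers on the full dataset before splitting would leak hold-out information into the selection step and inflate the apparent accuracy. Passing `range(n_markers)` instead of `selected` to `fit_marker_score` gives a full-marker baseline for comparison.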