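Genome-wide association study
Expression quantitative trait loci
Quantitative trait locus
Trait
Computational biology
Gene regulatory network
Biology
Variance (accounting)
Genetic association
Genome
Covariance
Gene
Statistical power
Genetics
Statistical model
Evolutionary biology
Computer science
Gene expression
Statistics
Mathematics
Machine learning
Single-nucleotide polymorphism
Accounting
Genotype
Business
Programming language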
Authors
Natália Ružičková,Michal Hledík,Gašper Tkačik
Identifier
DOI:10.1101/2024.02.01.578486
Abstract
As their statistical power grows, genome-wide association studies (GWAS) have identified an increasing number of loci underlying quantitative traits of interest. These loci are scattered throughout the genome, and each is individually responsible for only a small fraction of the total heritable trait variance. The recently proposed omnigenic model provides a conceptual framework to explain these observations by postulating that numerous distant loci contribute to each complex trait via effect propagation through intracellular regulatory networks. We formalize this conceptual framework by proposing the “quantitative omnigenic model” (QOM), a statistical model that combines prior knowledge of the regulatory network topology with genomic data. By applying our model to gene expression traits in yeast, we demonstrate that QOM achieves gene expression prediction performance similar to that of traditional GWAS with hundreds of times fewer parameters, while simultaneously extracting candidate causal and quantitative chains of effect propagation through the regulatory network for every individual gene. We estimate the fraction of heritable trait variance in cis and in trans, break the latter down by effect propagation order, assess the trans variance not attributable to transcriptional regulation, and show that QOM correctly accounts for the low-dimensional structure of gene expression covariance. We furthermore demonstrate the relevance of QOM for systems biology by employing it as a statistical test for the quality of regulatory network reconstructions and by linking it to the propagation of non-transcriptional (including environmental) effects.
Significance statement
Genetic variation leads to differences in traits implicated in health and disease. The identification of genetic variants associated with these traits is spearheaded by “genome-wide association studies” (GWAS), statistically rigorous procedures whose power has grown with the number of genotyped samples. Nevertheless, GWAS have a substantial shortcoming: they are ill-equipped to detect the causal basis and reveal the complex systemic mechanisms of polygenic traits. Even a single genetic change can propagate throughout the entire genetic regulatory network, causing a myriad of spurious detections and thereby significantly limiting the usefulness of GWAS. To address this, we propose a novel statistical approach that incorporates known regulatory network information, with the potential to boost the interpretability of state-of-the-art genomic analyses while simultaneously extracting systems biology insights.
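The notion of effect propagation order mentioned in the abstract can be illustrated with a toy linear network. The sketch below is not the QOM itself, whose exact form is not given on this page. It assumes a simple linear model x = c + Bx, where c holds each gene's cis effect and B is a hypothetical regulatory weight matrix, so that x = (I - B)^(-1) c = c + Bc + B²c + ..., with the k-th term being the contribution that has travelled k regulatory steps. All names, weights, and genotypes are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

n_genes = 4
# Hypothetical regulatory weights: B[i, j] is the effect of gene j on gene i.
# A simple chain 0 -> 1 -> 2 -> 3, so B is nilpotent and the series terminates.
B = np.zeros((n_genes, n_genes))
B[1, 0] = 0.8
B[2, 1] = 0.5
B[3, 2] = 0.4

# Hypothetical cis effects: only gene 0 carries a local variant (binary genotype).
n_samples = 2000
genotype = rng.integers(0, 2, size=n_samples).astype(float)
cis = np.zeros((n_samples, n_genes))
cis[:, 0] = 1.0 * genotype


def propagate_by_order(cis, B, max_order):
    """Return a list whose k-th element is the order-k contribution, cis @ (B^k).T."""
    terms = []
    term = cis.copy()
    for _ in range(max_order + 1):
        terms.append(term)
        term = term @ B.T  # push the effect one more regulatory step
    return terms


terms = propagate_by_order(cis, B, max_order=n_genes - 1)
expression = sum(terms)

# Closed form of the same toy model: x = (I - B)^(-1) c, applied row-wise.
closed = cis @ np.linalg.inv(np.eye(n_genes) - B).T

# Variance of each gene's expression, split by propagation order.
# Rows are orders 0..3 (order 0 = cis), columns are genes.
var_by_order = np.array([t.var(axis=0) for t in terms])
fraction_by_order = var_by_order / expression.var(axis=0)
```

In this chain each gene receives its signal through a single path, so the per-order variances add up exactly to the total. In a network with several paths of different lengths into the same gene, the order-k terms would generally be correlated and the decomposition would also need covariance terms.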