Characterizing features affecting local ancestry inference performance in admixed populations

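Inference, Evolutionary biology, R package, Computer science, Geography, Genealogy, Artificial intelligence, Biology, Statistics, Mathematics, History
Authors
Jessica Honorato‐Mauer, Nirav N. Shah, Adam X. Maihofer, Clement C. Zai, Síntia Belangero, Caroline M. Nievergelt, Marcos Santoro, Elizabeth G. Atkinson
Identifier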
DOI:10.1101/2024.08.26.609770
Abstract

In recent years, significant efforts have been made to improve methods for genomic studies of admixed populations using Local Ancestry Inference (LAI). Accurate LAI is crucial to ensure that downstream analyses reflect the genetic ancestry of research participants. Here, we test analytic strategies for LAI to provide guidelines for optimal accuracy, focusing on admixed populations reflective of Latin America's primary continental ancestries: African (AFR), Amerindigenous (AMR), and European (EUR). Simulating LD-informed admixed haplotypes under a variety of 2- and 3-way admixture models, we implemented a standard LAI pipeline, testing three reference panel compositions to quantify their overall and ancestry-specific accuracy. We examined LAI miscall frequencies and true positive rates (TPR) across simulation models and continental ancestries. AMR tracts have notably reduced LAI accuracy compared to EUR and AFR tracts in all comparisons, with TPR means for AMR ranging from 88-94%, EUR from 96-99%, and AFR from 98-99%. When LAI miscalls occurred, they most frequently assigned European ancestry to truly Amerindigenous sites. Using a reference panel well-matched to the target population, even with a lower sample size, LAI produced true-positive estimates that were not statistically different from those obtained with a larger but mismatched reference panel, while being more computationally efficient. While directly responsive to admixed Latin American cohort compositions, these trends are broadly useful for informing best practices for LAI across other admixed populations. Our findings reinforce the need for inclusion of more underrepresented populations in sequencing efforts to improve reference panels.
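
The two metrics named in the abstract, ancestry-specific TPR and miscall frequency, can be illustrated with a minimal Python sketch. It assumes that simulated truth and LAI output are available as per-site ancestry labels of equal length. The function names and the toy data are hypothetical and are not taken from the authors' pipeline; the toy values are invented for illustration and are not results from the study.

```python
from collections import Counter

ANCESTRIES = ("AFR", "AMR", "EUR")


def confusion_counts(true_calls, inferred_calls):
    """Count (true ancestry, inferred ancestry) pairs across sites."""
    if len(true_calls) != len(inferred_calls):
        raise ValueError("true and inferred calls must have the same length")
    return Counter(zip(true_calls, inferred_calls))


def true_positive_rates(counts, ancestries=ANCESTRIES):
    """Per-ancestry TPR: correctly called sites / all sites truly of that ancestry."""
    rates = {}
    for anc in ancestries:
        total = sum(n for (truth, _), n in counts.items() if truth == anc)
        rates[anc] = counts[(anc, anc)] / total if total else float("nan")
    return rates


def most_common_miscall(counts):
    """Return the most frequent (true, inferred) pair where the two differ."""
    miscalls = {pair: n for pair, n in counts.items() if pair[0] != pair[1]}
    return max(miscalls, key=miscalls.get) if miscalls else None


# Toy data (hypothetical): 4 sites per true ancestry.
true_calls = ["AFR"] * 4 + ["AMR"] * 4 + ["EUR"] * 4
inferred_calls = (
    ["AFR", "AFR", "AFR", "AFR"]
    + ["AMR", "AMR", "EUR", "EUR"]
    + ["EUR", "EUR", "EUR", "AMR"]
)

counts = confusion_counts(true_calls, inferred_calls)
tpr = true_positive_rates(counts)
top_miscall = most_common_miscall(counts)
```

In this toy example the dominant miscall is a truly AMR site labelled EUR, the same direction of error the abstract reports; the numeric rates carry no meaning beyond showing the arithmetic.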
