Characterizing features affecting local ancestry inference performance in admixed populations

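Keywords: Inference, Evolutionary biology, R package, Computer science, Geography, Genealogy, Artificial intelligence, Biology, Statistics, Mathematics, History
Authors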
Jessica Honorato‐Mauer, Nirav N. Shah, Adam X. Maihofer, Clement C. Zai, Síntia Belangero, Caroline M. Nievergelt, Marcos Santoro, Elizabeth G. Atkinson
Identifier
DOI: 10.1101/2024.08.26.609770
Abstract

In recent years, significant efforts have been made to improve methods for genomic studies of admixed populations using Local Ancestry Inference (LAI). Accurate LAI is crucial to ensure that downstream analyses correctly reflect the genetic ancestry of research participants. Here, we test analytic strategies for LAI to provide guidelines for optimal accuracy, focusing on admixed populations reflective of Latin America's primary continental ancestries: African (AFR), Amerindigenous (AMR), and European (EUR). Simulating LD-informed admixed haplotypes under a variety of 2- and 3-way admixture models, we implemented a standard LAI pipeline, testing three reference panel compositions to quantify their overall and ancestry-specific accuracy. We examined LAI miscall frequencies and true positive rates (TPR) across simulation models and continental ancestries. AMR tracts have notably reduced LAI accuracy compared to EUR and AFR tracts in all comparisons, with mean TPR ranging from 88-94% for AMR, 96-99% for EUR, and 98-99% for AFR. When LAI miscalls occurred, they most frequently assigned European ancestry to true Amerindigenous sites. Using a reference panel well-matched to the target population, even with a lower sample size, LAI produced true-positive estimates that were not statistically different from those obtained with a larger but mismatched reference panel, while being more computationally efficient. While directly responsive to admixed Latin American cohort compositions, these trends are broadly useful for informing best practices for LAI across other admixed populations. Our findings reinforce the need for inclusion of more underrepresented populations in sequencing efforts to improve reference panels.