Improved pathogenicity prediction for rare human missense variants

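Missense mutation; Pathogenicity; Inference; Computer science; Artificial intelligence; Machine learning; Annotation; Exploit; Limit (mathematics); Feature (linguistics); Personalized medicine; Computational biology; Data mining; Genetics; Biology; Mutation; Mathematics; Gene; Microbiology; Mathematical analysis; Linguistics; Philosophy; Computer security
Authors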
Yingzhou Wu,Hanqing Liu,Roujia Li,Song Sun,Jochen Weile,Frederick P. Roth
Source
Journal: American Journal of Human Genetics [Elsevier BV]
Volume (issue): 108 (10): 1891-1906 Citations: 105
Identifier
DOI:10.1016/j.ajhg.2021.08.012
Abstract

The success of personalized genomic medicine depends on our ability to assess the pathogenicity of rare human variants, including the important class of missense variation. There are many challenges in training accurate computational systems, e.g., in finding the balance between quantity, quality, and bias in the variant sets used as training examples and avoiding predictive features that can accentuate the effects of bias. Here, we describe VARITY, which judiciously exploits a larger reservoir of training examples with uncertain accuracy and representativity. To limit circularity and bias, VARITY excludes features informed by variant annotation and protein identity. To provide a rationale for each prediction, we quantified the contribution of features and feature combinations to the pathogenicity inference of each variant. VARITY outperformed all previous computational methods evaluated, identifying at least 10% more pathogenic variants at thresholds achieving high (90% precision) stringency.
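
The headline comparison in the abstract is stated in terms of how many pathogenic variants a predictor identifies at a threshold achieving 90% precision. One common way to read such a figure is as recall at a fixed precision. The sketch below is an illustration under that assumption only: it is not the authors' evaluation code, the function name `recall_at_precision` is hypothetical, and the scores and labels are toy values rather than data from the paper.

```python
from typing import Sequence


def recall_at_precision(
    scores: Sequence[float],
    labels: Sequence[int],
    min_precision: float = 0.9,
) -> float:
    """Return the highest recall over all score thresholds whose
    precision is at least min_precision.

    labels: 1 = pathogenic, 0 = benign (toy convention for this sketch).
    scores: higher means "more likely pathogenic".
    Returns 0.0 if no threshold reaches the required precision.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    total_pos = sum(labels)
    if total_pos == 0:
        raise ValueError("need at least one pathogenic example")

    # Rank variants from highest to lowest score.
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])

    tp = fp = 0
    best_recall = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Tied scores fall on the same side of any threshold,
        # so count them together before evaluating precision.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        precision = tp / (tp + fp)
        if precision >= min_precision:
            best_recall = max(best_recall, tp / total_pos)
    return best_recall


# Toy example: six variants, four of them pathogenic.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [1, 1, 1, 0, 1, 0]
print(recall_at_precision(toy_scores, toy_labels, min_precision=0.9))  # 0.75
```

Under this reading, comparing two predictors means computing this value for each on the same labeled test set and comparing the results; whether the abstract's "10% more" is an absolute or a relative difference is not stated there.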