A method and server for predicting damaging missense mutations

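Keywords: Genetics; Biology; Computational biology; Computer science; Missense mutation; Mutation; Gene
Authors
Ivan Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily Ramensky, Anna Gerasimova, Peer Bork, Alexey S. Kondrashov, Shamil Sunyaev
Source
Journal: Nature Methods [Nature Portfolio]
Volume (issue): 7 (4): 248-249 Citations: 10657
Identifier
DOI: 10.1038/nmeth0410-248
Abstract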

To the Editor: Applications of rapidly advancing sequencing technologies exacerbate the need to interpret individual sequence variants. Sequencing of phenotyped clinical subjects will soon become a method of choice in studies of the genetic causes of Mendelian and complex diseases. New exon capture techniques will direct sequencing efforts towards the most informative and easily interpretable protein-coding fraction of the genome. Thus, the demand for computational predictions of the impact of protein sequence variants will continue to grow. Here we present a new method and the corresponding software tool, PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/), which differs from the earlier tool PolyPhen [1] in its set of predictive features, its alignment pipeline, and its method of classification (Fig. 1a). PolyPhen-2 uses eight sequence-based and three structure-based predictive features (Supplementary Table 1), which were selected automatically by an iterative greedy algorithm (Supplementary Methods). The majority of these features involve a comparison of a property of the wild-type (ancestral, normal) allele with the corresponding property of the mutant (derived, disease-causing) allele; together, the two alleles define an amino acid replacement. The most informative features characterize how well the two human alleles fit the pattern of amino acid replacements within the multiple sequence alignment of homologous proteins, how distant the protein harboring the first deviation from the human wild-type allele is from the human protein, and whether the mutant allele originated at a hypermutable site [2]. The alignment pipeline selects the set of homologous sequences for the analysis using a clustering algorithm and then constructs and refines their multiple alignment (Supplementary Fig. 1). The functional significance of an allele replacement is predicted from its individual features (Supplementary Figs. 2–4) by a Naive Bayes classifier (Supplementary Methods).
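To make the classification step concrete, the following is a minimal sketch of a categorical Naive Bayes classifier that turns per-feature evidence into a posterior probability of damage. It is not PolyPhen-2's implementation: the two features (`conservation`, `burial`), their values, the toy training data, and the Laplace smoothing parameter `alpha` are all assumptions made for illustration, and the actual features and training procedure are described in the Supplementary Methods.

```python
import math
from collections import Counter


def train_naive_bayes(samples, labels, alpha=1.0):
    """Fit a categorical Naive Bayes model with Laplace smoothing.

    samples: list of tuples of discrete feature values
    labels:  list of class labels, one per sample
    """
    class_counts = Counter(labels)
    n_features = len(samples[0])
    # value_counts[c][j][v] = number of class-c samples whose feature j equals v
    value_counts = {c: [Counter() for _ in range(n_features)] for c in class_counts}
    vocab = [set() for _ in range(n_features)]
    for x, y in zip(samples, labels):
        for j, v in enumerate(x):
            value_counts[y][j][v] += 1
            vocab[j].add(v)
    return class_counts, value_counts, vocab, alpha


def posterior(model, x, target="damaging"):
    """Posterior probability of `target` given features x, assuming the
    features are conditionally independent given the class."""
    class_counts, value_counts, vocab, alpha = model
    total = sum(class_counts.values())
    log_scores = {}
    for c, n_c in class_counts.items():
        log_p = math.log(n_c / total)  # class prior
        for j, v in enumerate(x):
            num = value_counts[c][j][v] + alpha
            den = n_c + alpha * len(vocab[j])
            log_p += math.log(num / den)  # smoothed per-feature likelihood
        log_scores[c] = log_p
    # Normalize in log space for numerical stability.
    m = max(log_scores.values())
    z = sum(math.exp(s - m) for s in log_scores.values())
    return math.exp(log_scores[target] - m) / z


# Hypothetical toy data: (conservation, burial) for six amino acid replacements.
toy_samples = [
    ("conserved", "buried"), ("conserved", "exposed"), ("conserved", "buried"),
    ("variable", "exposed"), ("variable", "exposed"), ("variable", "buried"),
]
toy_labels = ["damaging"] * 3 + ["benign"] * 3
toy_model = train_naive_bayes(toy_samples, toy_labels)

p_damaging = posterior(toy_model, ("conserved", "buried"))
p_benign_like = posterior(toy_model, ("variable", "exposed"))
```

In this toy model a replacement at a conserved, buried position receives a high posterior probability of damage, while one at a variable, exposed position receives a low one. The independence assumption is what lets the per-feature likelihoods simply multiply.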
Figure 1. PolyPhen-2 pipeline and prediction accuracy. (a) Overview of the algorithm. (b) Receiver operating characteristic (ROC) curves for predictions made by PolyPhen-2 using five-fold cross-validation on HumDiv (red) and HumVar [3] (light green). UniRef100 (solid ...

We used two pairs of datasets to train and test PolyPhen-2. We compiled the first pair, HumDiv, from all 3,155 damaging alleles in the UniProt database that cause human Mendelian diseases and have known effects on molecular function, together with 6,321 differences between human proteins and their closely related mammalian homologs, assumed to be non-damaging (Supplementary Methods). The second pair, HumVar [3], consists of all 13,032 human disease-causing mutations from UniProt, together with 8,946 human nsSNPs without annotated involvement in disease, which were treated as non-damaging. We found that the performance of PolyPhen-2, as measured by its receiver operating characteristic curves, was consistently superior to that of PolyPhen (Fig. 1b), and it also compared favorably with three other popular prediction tools [4–6] (Fig. 1c). At a false positive rate of 20%, PolyPhen-2 achieves true positive rates of 92% and 73% on HumDiv and HumVar, respectively (Supplementary Table 2). One reason for the lower accuracy of predictions on HumVar is that the nsSNPs assumed to be non-damaging in HumVar contain a sizable fraction of mildly deleterious alleles. In contrast, most of the amino acid replacements assumed to be non-damaging in HumDiv must be close to selective neutrality. Because alleles that are even mildly but unconditionally deleterious cannot be fixed in the evolving lineage, no method based on comparative sequence analysis is ideal for discriminating between drastically and mildly deleterious mutations, which are assigned to opposite categories in HumVar. Another reason is that HumDiv uses an extra criterion to avoid possible erroneous annotations of damaging mutations.
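The "true positive rate at a 20% false positive rate" figure quoted above corresponds to reading one point off a ROC curve. The sketch below shows how such a point could be computed from classifier scores; the function name `tpr_at_fpr`, the scores, and the labels are hypothetical and are not taken from the paper's data.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.20):
    """Return (threshold, fpr, tpr) for the most permissive score threshold
    whose false positive rate does not exceed max_fpr.

    labels: 1 = damaging, 0 = non-damaging.
    A mutation is predicted damaging when its score >= threshold.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    best = (float("inf"), 0.0, 0.0)  # threshold so strict that nothing is called damaging
    # Lowering the threshold can only increase both FPR and TPR,
    # so we can stop at the first threshold that exceeds the FPR budget.
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fpr = fp / negatives
        tpr = tp / positives
        if fpr > max_fpr:
            break
        best = (t, fpr, tpr)
    return best


# Hypothetical scores for five damaging (1) and five non-damaging (0) variants.
toy_scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20, 0.15, 0.10]
toy_truth = [1, 1, 1, 0, 1, 0, 1, 0, 0, 0]

threshold, fpr, tpr = tpr_at_fpr(toy_scores, toy_truth, max_fpr=0.20)
```

With these toy values, one of the five non-damaging variants is misclassified at the chosen threshold (FPR of 20%) while four of the five damaging variants are recovered (TPR of 80%).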
For each mutation, PolyPhen-2 calculates the Naive Bayes posterior probability that the mutation is damaging and reports estimates of the false positive rate (the chance that the mutation is classified as damaging when it is in fact non-damaging) and the true positive rate (the chance that the mutation is classified as damaging when it is indeed damaging). A mutation is also appraised qualitatively, as benign, possibly damaging, or probably damaging (Supplementary Methods). The user can choose between HumDiv- and HumVar-trained PolyPhen-2. Diagnostics of Mendelian diseases requires distinguishing mutations with drastic effects from all the remaining human variation, including abundant mildly deleterious alleles. Thus, HumVar-trained PolyPhen-2 should be used for this task. In contrast, HumDiv-trained PolyPhen-2 should be used for evaluating rare alleles at loci potentially involved in complex phenotypes, for dense mapping of regions identified by genome-wide association studies, and for analysis of natural selection from sequence data, where even mildly deleterious alleles must be treated as damaging.
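The qualitative appraisal and the choice of training set can be sketched as follows. The mapping from task to model restates the recommendations in the text; the task keys are names invented here. The two cutoffs in `appraise` are hypothetical placeholders only, since the text defers the actual classification rule to the Supplementary Methods.

```python
# Training set recommended in the text for each kind of analysis.
# The dictionary keys are illustrative names, not PolyPhen-2 options.
RECOMMENDED_MODEL = {
    "mendelian_diagnostics": "HumVar",
    "complex_phenotype_rare_alleles": "HumDiv",
    "gwas_region_dense_mapping": "HumDiv",
    "natural_selection_analysis": "HumDiv",
}


def appraise(posterior, possibly_cutoff=0.5, probably_cutoff=0.9):
    """Map a posterior probability of damage to a qualitative label.

    The default cutoffs are hypothetical placeholders, not the values
    used by PolyPhen-2.
    """
    if not 0.0 <= posterior <= 1.0:
        raise ValueError("posterior must lie in [0, 1]")
    if possibly_cutoff > probably_cutoff:
        raise ValueError("possibly_cutoff must not exceed probably_cutoff")
    if posterior >= probably_cutoff:
        return "probably damaging"
    if posterior >= possibly_cutoff:
        return "possibly damaging"
    return "benign"


model_for_diagnostics = RECOMMENDED_MODEL["mendelian_diagnostics"]
label_high = appraise(0.95)
label_mid = appraise(0.60)
label_low = appraise(0.10)
```

Keeping the cutoffs as parameters reflects that a three-way label is only a thresholded view of the posterior probability; different training sets would generally call for different cutoffs.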