Inadequacy of Evolutionary Profiles Vis-a-vis Single Sequences in Predicting Transient DNA-Binding Sites in Proteins

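Keywords: DNA; Computational biology; DNA-binding site; Biology; Binding site; Threading (protein sequence); DNA sequencing; DNA-binding protein; Protein structure; Genetics; Biophysics; Biochemistry; Gene; Promoter; Transcription factor; Gene expression
Authors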
Ajay Arya,Dana Mary Varghese,Ajay Verma,Shandar Ahmad
Source
Journal: Journal of Molecular Biology [Elsevier]
Volume (issue): 434 (13): 167640-167640 Citations: 1
Identifier
DOI: 10.1016/j.jmb.2022.167640
Abstract

Sequence-based prediction of DNA-binding residues in a protein is a widely studied problem for which machine learning methods with continuously improving predictive power have been developed. Concatenated rows within a sliding window of a Position Specific Substitution Matrix (PSSM) of the protein are currently used as the primary feature set in almost all methods for predicting DNA-binding residues. Here we report that these evolutionary profiles are powerful only for identifying conserved binding sites and fall short for residue positions that undergo binding-to-non-binding transitions in closely related proteins. We created a database of highly similar protein pairs with known protein-DNA complexes and investigated the differential predictability of conserved and transient binding residues within each pair. Retraining machine learning models uniformly, we compared the predictive power of models trained on PSSMs against that of models similarly trained on sparse-encoded single sequences. We found that, under controlled experiments, single-sequence-based models outperform models based on evolutionary profiles in predicting transient binding sites by as much as 8 percentage points. Thus, we conclude that PSSM-based models are inadequate for predicting high-specificity DNA-binding residues. These findings are of critical significance for the design of mutant- and species-specific DNA ligands and for homology-based modeling of protein-DNA complexes.
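
The two feature sets contrasted in the abstract, sliding-window PSSM rows and sparse-encoded single sequences, can be illustrated with a minimal sketch. This is not the authors' pipeline: the 20-letter alphabet, the window size, the zero-padding at the termini, the function names `one_hot` and `window_features`, and the random matrix standing in for a real PSSM are all assumptions made for illustration.

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues (assumed alphabet)


def one_hot(sequence):
    """Sparse-encode a sequence as an (L, 20) matrix; unknown residues stay all-zero."""
    index = {aa: i for i, aa in enumerate(AMINO_ACIDS)}
    encoded = np.zeros((len(sequence), len(AMINO_ACIDS)))
    for pos, aa in enumerate(sequence):
        if aa in index:
            encoded[pos, index[aa]] = 1.0
    return encoded


def window_features(matrix, window=7):
    """Concatenate the rows inside a centred sliding window, zero-padding the termini."""
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    padded = np.pad(matrix, ((half, half), (0, 0)))  # zeros above and below
    return np.stack([padded[i:i + window].ravel() for i in range(matrix.shape[0])])


sequence = "MKRAQ"  # hypothetical toy sequence

# Single-sequence features: one row of window * 20 values per residue.
single_seq_features = window_features(one_hot(sequence), window=3)

# A real PSSM would come from a profile search against a sequence database;
# random values stand in for it here (placeholder, not real data).
pssm = np.random.default_rng(0).normal(size=(len(sequence), 20))
pssm_features = window_features(pssm, window=3)
```

Both feature matrices have the same shape, so the same classifier can be trained on either one, which is the kind of like-for-like comparison the abstract describes.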