Classification of domains in predicted structures of the human proteome

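Computational biology; Human Proteome Project; Proteome; Protein domain; Biology; Homology (biology); Human proteins; Sequence alignment; Homology modeling; Protein structure; Peptide sequence; Bioinformatics; Computer science; Artificial intelligence; Proteomics; Amino acid; Genetics; Gene; Biochemistry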
Authors
R. Dustin Schaeffer, Jing Zhang, Lisa N. Kinch, Jimin Pei, Nick V. Grishin
Source
Journal: Proceedings of the National Academy of Sciences of the United States of America [Proceedings of the National Academy of Sciences]
Volume (issue): 120 (12)
Identifier
DOI:10.1073/pnas.2214069120
Abstract

Recent advances in protein structure prediction have generated accurate structures of previously uncharacterized human proteins. Identifying domains in these predicted structures and classifying them into an evolutionary hierarchy can reveal biological insights. Here, we describe the detection and classification of domains from the human proteome. Our classification indicates that only 62% of residues are located in globular domains. We further classify these globular domains and observe that the majority (65%) can be classified among known folds by sequence, with a smaller fraction (33%) requiring structural data to refine the domain boundaries and/or to support their homology. A relatively small number (966 domains) cannot be confidently assigned using our automatic pipelines, thus demanding manual inspection. We classify 47,576 domains, of which only 23% have been included in experimental structures. A portion (6.3%) of these classified globular domains lack sequence-based annotation in InterPro. A quarter (23%) have not been structurally modeled by homology, and they contain 2,540 known disease-causing single amino acid variations whose pathogenesis can now be inferred using AF models. A comparison of classified domains from a series of model organisms revealed expansions of several immune response-related domains in humans and a depletion of olfactory receptors. Finally, we use this classification to expand well-known protein families of biological significance. These classifications are presented on the ECOD website (http://prodata.swmed.edu/ecod/index_human.php).
