Leveraging structure for enzyme function prediction: methods, opportunities, and challenges

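Keywords: Computational biology; Function (biology); Biology; Identification (biology); Protein function; Genome; Biochemistry; Genetics; Gene; Plant
Authors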
Matthew P. Jacobson,Chakrapani Kalyanaraman,Suwen Zhao,Boxue Tian
Source
Journal: Trends in Biochemical Sciences [Elsevier BV]
Volume (issue): 39 (8): 363-371 Citations: 34
Identifier
DOI:10.1016/j.tibs.2014.05.006
Abstract

• Of the >50 million protein sequences, <1% have experimentally determined functions.
• Protein structures can provide clues to function, such as the substrates of enzymes.
• Homology modeling and ligand docking algorithms can help infer function from structure.
• Recent successes include discovery of novel metabolites, enzymes, and pathways.

The rapid growth of the number of protein sequences that can be inferred from sequenced genomes presents challenges for function assignment, because only a small fraction (currently <1%) has been experimentally characterized. Bioinformatics tools are commonly used to predict functions of uncharacterized proteins. Recently, there has been significant progress in using protein structures as an additional source of information to infer aspects of enzyme function, which is the focus of this review. Successful application of these approaches has led to the identification of novel metabolites, enzyme activities, and biochemical pathways. We discuss opportunities to elucidate systematically protein domains of unknown function, orphan enzyme activities, dead-end metabolites, and pathways in secondary metabolism.
Homology modeling: a computational technique that builds an atomic model of a target protein using its sequence and an experimental 3D structure of a homologous protein (called the 'template'). The quality of a homology model depends on the accuracy of the sequence alignment between target and template, which varies (loosely) with the sequence identity (roughly speaking, pairwise identity higher than 40% is ideal, and lower than 25% is poor).

Docking: a computational technique that predicts and ranks the binding poses of small molecule ligands to receptors (e.g., proteins). Docking usually comprises a sampling method that generates possible binding poses of a ligand in a binding site, and a scoring function that ranks these poses. Most scoring functions are empirical, and give only a crude estimate of the binding free energy of a ligand.

Secondary metabolism: biochemical pathways to produce organic molecules (i.e., secondary metabolites) that are not absolutely required for the survival of the organism. There are five particularly prevalent classes of secondary metabolite: isoprenoids, alkaloids, polyketides, nonribosomal peptides, and ribosomally synthesized and post-translationally modified peptides. Secondary metabolites are often restricted to a narrow set of species and have important ecological roles for the organisms that produce them. Many secondary metabolites are bioactive (antibacterial, anticancer, antifungal, antiviral, antioxidant, anti-inflammatory, antiparasitic, antimalarial, cytotoxic, etc.) and have been used as drugs and drug leads.

Structural genomics: an effort to determine the 3D, atomic-level structure of every protein encoded by a genome through a combination of high-throughput experimental and modeling approaches. The determination of a protein structure through a structural genomics effort often precedes knowledge of its function, motivating the development of methods to infer function from structure.
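
The sequence-identity rule of thumb quoted above for homology modeling (above 40% is ideal, below 25% is poor) can be illustrated with a minimal Python sketch. Everything specific here is an assumption for illustration and does not come from the review: the function names `percent_identity` and `template_quality` are hypothetical, identity is computed over alignment columns where neither sequence has a gap (other conventions exist), the label "intermediate" for the 25 to 40% range is ours, and the two aligned sequences are made up.

```python
def percent_identity(aligned_target: str, aligned_template: str) -> float:
    """Percent identity over columns where neither sequence has a gap.

    Both inputs are rows of an existing pairwise alignment (same length,
    '-' marks a gap). The choice of denominator is an assumption; some
    tools divide by the shorter sequence length or the full alignment length.
    """
    if len(aligned_target) != len(aligned_template):
        raise ValueError("aligned sequences must have equal length")
    aligned_columns = 0
    matches = 0
    for a, b in zip(aligned_target.upper(), aligned_template.upper()):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        aligned_columns += 1
        if a == b:
            matches += 1
    if aligned_columns == 0:
        return 0.0
    return 100.0 * matches / aligned_columns


def template_quality(identity: float) -> str:
    """Map percent identity to the rough categories quoted in the glossary.

    The 'intermediate' label for 25-40% is a hypothetical name, not a term
    used by the source.
    """
    if identity > 40.0:
        return "ideal"
    if identity < 25.0:
        return "poor"
    return "intermediate"


if __name__ == "__main__":
    # Made-up aligned sequences, for illustration only.
    target = "MKT-AYIAKQR"
    template = "MKTLAYLAK-R"
    identity = percent_identity(target, template)
    print(f"{identity:.1f}% identity: {template_quality(identity)}")
```

Because the glossary describes the relationship between identity and model quality as loose, a classification like this is at most a first filter when choosing a template, not a guarantee of model accuracy.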
