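Sequence motif
Sequence logo
Sequence alignment
Structural Classification of Proteins database
Sequence (biology)
Computational biology
Protein family
Sequence database
Peptide sequence
Numbering
Structural alignment
Consensus sequence
Structural motif
Multiple sequence alignment
Protein sequencing
Biology
Genetics
Protein structure
Computer science
Biochemistry
Algorithm
DNA
Gene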
Authors
Maike Gräff, Patrick C. F. Buchholz, Peter Stockinger, Bettina Bommarius, Andreas S. Bommarius, Jürgen Pleiss
Source
Journal: Proteins
[Wiley]
Date: 2019-02-04
Volume (issue): 87 (6): 443-451
Citations: 43
Abstract
The Short‐chain Dehydrogenases/Reductases Engineering Database (SDRED) covers one of the largest known protein families (168 150 proteins). Assignment to the superfamilies of Classical and Extended SDRs was achieved by global sequence similarity and by identification of family‐specific sequence motifs. Two standard numbering schemes were established for Classical and Extended SDRs that allow for the determination of conserved amino acid residues, such as cofactor specificity determining positions or superfamily specific sequence motifs. The comprehensive sequence dataset of the SDRED facilitates the refinement of family‐specific sequence motifs. The glycine‐rich motifs for Classical and Extended SDRs were refined to improve the precision of superfamily classification. In each superfamily, the majority of sequences formed a tightly connected sequence network and belonged to a large homologous family. Despite their different sequence motifs and their different sequence length, the two sequence networks of Classical and Extended SDRs are not separate, but connected by edges at a threshold of 40% sequence similarity, indicating that all SDRs belong to a large, connected network. The SDRED is accessible at https://sdred.biocatnet.de/.
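The network statement in the abstract can be illustrated with a minimal sketch: sequences are nodes, an edge joins two sequences whose pairwise similarity meets a threshold, and connected components are then counted. This is not the SDRED pipeline. The function name `connected_components`, the sequence labels, and the similarity scores below are hypothetical toy values, and the pairwise similarities are assumed to have been computed beforehand by some alignment tool.

```python
from itertools import combinations


def connected_components(nodes, similarity, threshold=0.40):
    """Group nodes into components, linking pairs with similarity >= threshold.

    `similarity` maps unordered pairs (as tuples) to a score in [0, 1];
    pairs missing from the mapping are treated as having similarity 0.
    """
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, shortening the path on the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(nodes, 2):
        score = similarity.get((a, b), similarity.get((b, a), 0.0))
        if score >= threshold:
            parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), set()).add(n)
    return list(groups.values())


# Hypothetical toy data, not taken from the SDRED.
nodes = ["classical_1", "classical_2", "extended_1", "extended_2"]
similarity = {
    ("classical_1", "classical_2"): 0.62,
    ("extended_1", "extended_2"): 0.58,
    ("classical_2", "extended_1"): 0.41,  # a single cross-superfamily edge
}

components_40 = connected_components(nodes, similarity, threshold=0.40)
components_50 = connected_components(nodes, similarity, threshold=0.50)
print(len(components_40), len(components_50))
```

In this toy example the single cross-superfamily edge at 0.41 is enough to merge both groups into one component at the 40% threshold, while raising the threshold to 50% splits them into two. Union-find is used here only because it keeps the sketch dependency-free; a graph library would work equally well.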