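Computer science
Embedding
Stability (learning theory)
Graph
Mutation
Context (archaeology)
Set (abstract data type)
Protein sequencing
Artificial intelligence
Machine learning
Computational biology
Theoretical computer science
Biology
Genetics
Peptide sequence
Gene
Paleontology
Programming language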
Authors
Gen Li, Sijie Yao, Long Fan
Identifier
DOI:10.1021/acs.jcim.3c01697
Abstract
Protein thermodynamic stability is essential to clarify the relationships among structure, function, and interaction. Therefore, developing a faster and more accurate method to predict the impact of the mutations on protein stability is helpful for protein design and understanding the phenotypic variation. Recent studies have shown that protein embedding will be particularly powerful at modeling sequence information with context dependence, such as subcellular localization, variant effect, and secondary structure prediction. Herein, we introduce a novel method, ProSTAGE, which is a deep learning method that fuses structure and sequence embedding to predict protein stability changes upon single point mutations. Our model combines graph-based techniques and language models to predict stability changes. Moreover, ProSTAGE is trained on a larger data set, which is almost twice as large as the most used S2648 data set. It consistently outperforms all existing state-of-the-art methods on mutation-affected problems as benchmarked on several independent data sets. The protein embedding as the prediction input achieves better results than the previous results, which shows the potential of protein language models in predicting the effect of mutations on proteins. ProSTAGE is implemented as a user-friendly web server.
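The abstract describes fusing structure-derived and sequence-derived embeddings to score a single point mutation. The following is only a toy sketch of that general idea, not ProSTAGE's actual architecture: the 8 Å contact cutoff, the single round of mean-neighbor aggregation, the untrained linear head, and all function names (`contact_graph`, `aggregate`, `predict_ddg`) are assumptions made for illustration, and the random arrays stand in for real C-alpha coordinates and language-model embeddings.

```python
import numpy as np

def contact_graph(coords, cutoff=8.0):
    """Adjacency matrix linking residues whose C-alpha atoms lie within
    `cutoff` angstroms (the cutoff value is an assumption)."""
    dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    adj = (dists < cutoff).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    return adj

def aggregate(adj, feats):
    """One round of mean-neighbor aggregation: each residue keeps its own
    embedding and appends the average embedding of its contacts."""
    deg = adj.sum(axis=1, keepdims=True)
    neigh = adj @ feats / np.maximum(deg, 1.0)  # isolated residues get zeros
    return np.concatenate([feats, neigh], axis=1)

def predict_ddg(coords, wt_emb, mut_emb, site, weights, bias=0.0):
    """Fuse wild-type context with the mutant-minus-wild-type difference at
    the mutated site, then apply a linear head (hypothetical, untrained)."""
    adj = contact_graph(coords)
    wt_ctx = aggregate(adj, wt_emb)[site]
    mut_ctx = aggregate(adj, mut_emb)[site]
    fused = np.concatenate([wt_ctx, mut_ctx - wt_ctx])
    return float(fused @ weights + bias)

# Placeholder inputs: random values standing in for real structure and
# protein language model embeddings.
rng = np.random.default_rng(0)
n_res, dim = 12, 16
coords = rng.normal(scale=5.0, size=(n_res, 3))
wt_emb = rng.normal(size=(n_res, dim))
mut_emb = wt_emb.copy()
mut_emb[4] += rng.normal(scale=0.1, size=dim)  # toy perturbation at site 4
weights = rng.normal(size=4 * dim)

score = predict_ddg(coords, wt_emb, mut_emb, site=4, weights=weights)
print(f"toy stability-change score: {score:.3f}")
```

Because the head is untrained, the printed score has no physical meaning; the sketch only shows how a structure graph can give each residue's sequence embedding a spatial context before the wild-type and mutant representations are compared. A real language model would also shift the embeddings of residues other than the mutated one, which this toy perturbation ignores.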