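Computer science
Graph
Solubility
Convolutional neural network
Artificial intelligence
Protein structure prediction
Machine learning
Protein structure
Theoretical computer science
Chemistry
Biochemistry
Organic chemistry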
Authors
Jie Chen,Yurong Qian,Zhijian Huang,Xiaojun Xiao,Lei Deng
Identifier
DOI:10.1109/bibm58861.2023.10385858
Abstract
Achieving optimal protein solubility is pivotal for efficient high-throughput purification, especially in industrial settings. However, conventional experimental techniques for assessing protein solubility in such contexts are not only costly but also time-intensive. Currently, numerous methods are available for predicting protein solubility, yet their effectiveness remains limited. Most of these approaches are predominantly sequence-based, failing to harness the invaluable structural insights inherent in proteins. Addressing these limitations, we introduce PPSol, an innovative protein solubility prediction methodology. Operating on protein sequences, PPSol employs ESM2 to predict protein contact maps, forming the basis for constructing protein graphs. Subsequently, well-established techniques are employed to predict protein feature representations as node features, including the utilization of the Position-Specific Scoring Matrix (PSSM). The resulting graph is fed into a graph convolutional neural network (GCN), enabling the acquisition of spatial structural information from proteins. Concurrently, ESM2-generated features undergo dimensional reduction via fully connected layers, integrating into every layer of the GCN for precise protein solubility prediction. Our approach excels through the fusion of pre-trained protein language models and GCNs, surpassing existing methodologies. Notably, PPSol attains state-of-the-art performance, showcasing a remarkable 2.8% enhancement in AUROC performance compared to prior strategies.
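The pipeline the abstract describes (contact map, then protein graph, then graph convolution with a fused sequence-level feature) can be illustrated with a small standard-library Python sketch. This is not the authors' implementation: the 0.5 contact threshold, the fusion by concatenation, the single layer, and all toy values and function names below are assumptions made only for illustration. The abstract does not say how the ESM2 features are integrated into each GCN layer.

```python
import math

def contact_map_to_adjacency(contact_probs, threshold=0.5):
    """Threshold a contact-probability matrix into a 0/1 adjacency matrix.

    Self-loops are always added. The threshold value is an assumption.
    """
    n = len(contact_probs)
    return [
        [1.0 if i == j or contact_probs[i][j] >= threshold else 0.0
         for j in range(n)]
        for i in range(n)
    ]

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2} used by standard GCNs."""
    deg = [sum(row) for row in adj]
    n = len(adj)
    return [
        [adj[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
        for i in range(n)
    ]

def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def gcn_layer(adj_norm, node_feats, weight, global_feat=None):
    """One graph convolution: ReLU(A_norm @ H @ W).

    If global_feat is given, it is appended to every node's features first
    (an assumed stand-in for fusing a reduced ESM2 embedding).
    """
    if global_feat is not None:
        node_feats = [row + list(global_feat) for row in node_feats]
    out = matmul(matmul(adj_norm, node_feats), weight)
    return [[max(0.0, v) for v in row] for row in out]

# Toy protein with 3 residues; every number here is made up.
contact_probs = [[1.0, 0.9, 0.1],
                 [0.9, 1.0, 0.2],
                 [0.1, 0.2, 1.0]]
node_feats = [[1.0, 0.0],   # stand-in for per-residue features such as PSSM
              [0.0, 1.0],
              [1.0, 1.0]]
global_feat = [0.5]          # stand-in for a dimension-reduced ESM2 feature
weight = [[1.0, 0.0],        # (2 node dims + 1 global dim) x 2 output dims
          [0.0, 1.0],
          [1.0, -1.0]]

adj = contact_map_to_adjacency(contact_probs)
hidden = gcn_layer(normalize_adjacency(adj), node_feats, weight, global_feat)
```

In this toy case residues 0 and 1 are in contact, so they end up with identical hidden vectors after one round of neighbor averaging, while the isolated residue 2 keeps only its own features. A real model would stack several such layers, pool the node vectors, and pass the result to a classifier.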