Sequence (biology)
Diffusion
Computer science
Chemistry
Thermodynamics
Physics
Biochemistry
Authors
Sitao Zhang,Zixuan Jiang,Rundong Huang,Wenting Huang,Shushi Peng,Shaoxun Mo,Letao Zhu,Peiheng Li,Ziyi Zhang,Emily Pan,Xi Chen,Y. F. Long,Liang Qi,Jin Tang,Renjing Xu,Rui Qing
Identifier
DOI:10.1002/advs.202502723
Abstract
Diffusion models have attracted enormous attention in computer vision and have emerged as a promising approach to protein design for precise structure and sequence generation. Here, PRO‐LDM is introduced: a modular multi‐tasking framework that combines design fidelity with computational efficiency by integrating the diffusion model in latent space. The model learns biological representations at local and global levels, and can design natural‐like species with enhanced diversity or optimize protein properties and functions. Its modular nature also allows integration with alternative pre‐trained encoders for enhanced generalization capability. Outlier design can be implemented by adjusting the classifier‐free guidance, which enables PRO‐LDM to sample vastly different regions of the latent space. The approach is demonstrated by generating a novel green‐fluorescent‐protein variant with notably enhanced fluorescence in multiple working scenarios, along with increased solubility and stability. The model provides a versatile tool for effectively extracting physicochemical and evolutionary information from sequences to design new proteins with optimized performance.