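Sequence (biology)
Chemistry
Protein design
Peptide sequence
Protein sequencing
Loop modeling
Stability (learning theory)
Protein structure
Computer science
Crystallography
Protein structure prediction
Biochemistry
Machine learning
Gene
Authors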
Richard W. Shuai, Talal Widatalla, Po‐Ssu Huang, Brian Hie
Identifier
DOI:10.1101/2025.02.13.637498
Abstract
Leading deep learning-based methods for fixed-backbone protein sequence design do not model protein sidechain conformation during sequence generation, despite the large role that the three-dimensional arrangement of sidechain atoms plays in protein conformation, stability, and overall function. Instead, these models implicitly reason about crucial sidechain interactions based on backbone geometry and known amino acid sequence labels. To address this, we present FAMPNN (Full-Atom MPNN), a sequence design method that explicitly models both sequence identity and sidechain conformation for each residue, where the per-token distributions of a residue's discrete amino acid identity and its continuous sidechain conformation are learned with a combined categorical cross-entropy and diffusion loss objective. We demonstrate that learning these distributions jointly is a highly synergistic task that both improves sequence recovery and achieves state-of-the-art sidechain packing. Furthermore, the benefits of explicit full-atom modeling generalize from sequence recovery to practical protein design applications, such as zero-shot prediction of experimental binding and stability measurements.
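The abstract describes a training objective that combines a categorical cross-entropy term (for amino acid identity) with a diffusion loss term (for sidechain conformation). The following is a minimal illustrative sketch of how such a combined per-residue loss could be formed. It is not the FAMPNN implementation: the function names, the use of a mean-squared noise-prediction error as the diffusion term, and the `diffusion_weight` parameter are all assumptions made for illustration.

```python
import math

def cross_entropy(logits, target_index):
    """Categorical cross-entropy for one residue, computed from unnormalized logits."""
    max_logit = max(logits)  # subtract the max for numerical stability
    log_norm = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
    return log_norm - logits[target_index]

def denoising_loss(predicted_noise, true_noise):
    """Mean squared error between predicted and true noise.

    Assumption: a noise-prediction MSE stands in for the diffusion loss;
    the abstract does not specify the exact form used.
    """
    squared = [(p - t) ** 2 for p, t in zip(predicted_noise, true_noise)]
    return sum(squared) / len(squared)

def combined_loss(logits, target_index, predicted_noise, true_noise,
                  diffusion_weight=1.0):
    """Sum of the sequence term and the weighted sidechain term.

    Assumption: `diffusion_weight` is a hypothetical balancing factor.
    """
    return (cross_entropy(logits, target_index)
            + diffusion_weight * denoising_loss(predicted_noise, true_noise))

# Uniform logits over 20 amino acids and a perfect noise prediction:
# the loss reduces to the cross-entropy of a uniform guess, log(20).
loss = combined_loss([0.0] * 20, 3, [0.1, 0.2], [0.1, 0.2])
```

In this sketch the two terms share no parameters; in a jointly trained model, both would presumably be computed from shared per-residue representations, which is where any synergy between the tasks would arise.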