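Intrinsically disordered proteins
Radius of gyration
Computer science
Sequence (biology)
Biophysics
Leverage (statistics)
Ensemble learning
Scaling
Computational biology
Artificial intelligence
Biology
Physics
Polymer
Mathematics
Genetics
Nuclear magnetic resonance
Geometry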
Authors
Jeffrey M. Lotthammer, Garrett M. Ginell, Daniel Griffith, Ryan J. Emenecker, Alex S. Holehouse
Source
Journal: Nature Methods
[Springer Nature]
Date: 2024-01-31
Volume (issue), pages: 21 (3): 465-476
Citations: 150
Identifier
DOI:10.1038/s41592-023-02159-5
Abstract
Intrinsically disordered regions (IDRs) are ubiquitous across all domains of life and play a range of functional roles. While folded domains are generally well described by a stable three-dimensional structure, IDRs exist in a collection of interconverting states known as an ensemble. This structural heterogeneity means that IDRs are largely absent from the Protein Data Bank, contributing to a lack of computational approaches to predict ensemble conformational properties from sequence. Here we combine rational sequence design, large-scale molecular simulations and deep learning to develop ALBATROSS, a deep-learning model for predicting ensemble dimensions of IDRs, including the radius of gyration, end-to-end distance, polymer-scaling exponent and ensemble asphericity, directly from sequences at a proteome-wide scale. ALBATROSS is lightweight, easy to use and accessible as both a locally installable software package and a point-and-click-style interface via Google Colab notebooks. We first demonstrate the applicability of our predictors by examining the generalizability of sequence–ensemble relationships in IDRs. Then, we leverage the high-throughput nature of ALBATROSS to characterize the sequence-specific biophysical behavior of IDRs within and between proteomes.
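For readers unfamiliar with the ensemble dimensions named in the abstract, the sketch below shows how three of them are conventionally computed from explicit coordinates. It is not the ALBATROSS predictor and does not use its API: ALBATROSS predicts these quantities from sequence, whereas this example measures them on conformations that are already available. All function names are hypothetical, each residue is assumed to be a single equally weighted (x, y, z) bead, the ensemble value is taken as a plain mean over conformations (other averaging conventions exist), and the polymer-scaling exponent is not covered.

```python
import math

def centroid(coords):
    """Geometric center of a list of (x, y, z) beads (equal weights assumed)."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))

def radius_of_gyration(coords):
    """Root-mean-square distance of the beads from their centroid."""
    c = centroid(coords)
    sq = sum(sum((p[i] - c[i]) ** 2 for i in range(3)) for p in coords)
    return math.sqrt(sq / len(coords))

def end_to_end_distance(coords):
    """Distance between the first and last bead of the chain."""
    return math.dist(coords[0], coords[-1])

def asphericity(coords):
    """Asphericity from the gyration tensor T: 0 for a sphere-like shape, 1 for a rod.

    Uses 1 - 3 * (l1*l2 + l2*l3 + l3*l1) / (l1 + l2 + l3)**2, where the l's are
    the eigenvalues of T. The pairwise sum equals (tr(T)**2 - tr(T @ T)) / 2,
    so no eigendecomposition is needed.
    """
    c = centroid(coords)
    n = len(coords)
    t = [[sum((p[a] - c[a]) * (p[b] - c[b]) for p in coords) / n
          for b in range(3)] for a in range(3)]
    trace = t[0][0] + t[1][1] + t[2][2]
    trace_of_square = sum(t[a][b] * t[b][a] for a in range(3) for b in range(3))
    pair_sum = (trace ** 2 - trace_of_square) / 2
    return 1 - 3 * pair_sum / trace ** 2

def ensemble_mean(conformations, observable):
    """Plain average of a per-conformation observable over an ensemble."""
    return sum(observable(c) for c in conformations) / len(conformations)

# Two toy conformations: a straight rod and the six vertices of an octahedron.
rod = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
octahedron = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0),
              (0.0, -1.0, 0.0), (0.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
toy_ensemble = [rod, octahedron]
mean_rg = ensemble_mean(toy_ensemble, radius_of_gyration)
```

The two toy conformations bracket the asphericity range: the rod evaluates to 1 and the octahedron to 0, which is a quick sanity check on the gyration-tensor arithmetic.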