AlphaFold2's training set powers its predictions of some fold‐switched conformations

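Computational biology; Training set; Protein secondary structure; Set (abstract data type); Protein structure; Biological system; Biology; Computer science; Artificial intelligence; Biochemistry; Programming language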

Authors

Joseph W. Schafer, Lauren L. Porter

Source

Journal: Protein Science [Wiley]
Date: 2025-03-25; Volume (issue): 34 (4)

Links

nih.gov, doi.org

Identifier

DOI: 10.1002/pro.70105

Abstract

AlphaFold2 (AF2), a deep-learning-based model that predicts protein structures from their amino acid sequences, has recently been used to predict multiple protein conformations. In some cases, AF2 has successfully predicted both dominant and alternative conformations of fold-switching proteins, which remodel their secondary and/or tertiary structures in response to cellular stimuli. Whether AF2 has learned enough protein folding principles to reliably predict alternative conformations outside of its training set is unclear. Previous work suggests that AF2 predicted these alternative conformations by memorizing them during training. Here, we use CFold, an implementation of the AF2 network trained on a more limited subset of experimentally determined protein structures, to directly test how well the AF2 architecture predicts alternative conformations of fold switchers outside of its training set. We tested CFold on eight fold switchers from six protein families. These proteins, whose secondary structures switch between α-helix and β-sheet and/or whose hydrogen bonding networks are reconfigured dramatically, had not been tested previously, and only one of their alternative conformations was in CFold's training set. Successful CFold predictions would indicate that the AF2 architecture can predict disparate alternative conformations of fold switchers outside of its training set, while unsuccessful predictions would suggest that AF2 predictions of these alternative conformations likely arise from association with structures learned during training. Despite sampling 1300–4300 structures per protein with various sequence sampling techniques, CFold predicted only one alternative structure outside of its training set accurately and with high confidence, while also generating experimentally inconsistent structures with higher confidence.
Though these results indicate that AF2's current success in predicting alternative conformations of fold switchers stems largely from its training data, results from a sequence pruning technique suggest developments that could lead to a more reliable generative model in the future.
