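Sequence space
Computer science
Sequence (biology)
Artificial intelligence
Function (biology)
Workflow
Protein engineering
Machine learning
Protein sequencing
Protein function
Bioinformatics
Generative grammar
Protein function prediction
Computational biology
Biology
Peptide sequence
Biochemistry
Genetics
Mathematics
Database
Evolutionary biology
Gene
Pure mathematics
Banach space
Enzyme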
Authors
Chase R Freschlin,Sarah A Fahlberg,Philip A. Romero
Identifier
DOI:10.1016/j.copbio.2022.102713
Abstract
Machine learning (ML) is revolutionizing our ability to understand and predict the complex relationships between protein sequence, structure, and function. Predictive sequence-function models are enabling protein engineers to efficiently search the sequence space for useful proteins with broad applications in biotechnology. In this review, we highlight the recent advances in applying ML to protein engineering. We discuss supervised learning methods that infer the sequence-function mapping from experimental data and new sequence representation strategies for data-efficient modeling. We then describe the various ways in which ML can be incorporated into protein engineering workflows, including purely in silico searches, ML-assisted directed evolution, and generative models that can learn the underlying distribution of the protein function in a sequence space. ML-driven protein engineering will become increasingly powerful with continued advances in high-throughput data generation, data science, and deep learning.