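Bioinformatics
Sequence (biology)
Computer science
Function (biology)
Machine learning
Artificial intelligence
Sequence space
Generative grammar
Protein function
Computational biology
Directed molecular evolution
Directed evolution
Biology
Genetics
Mathematics
Gene
Banach space
Mutant
Pure mathematics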
Authors
Bruce J. Wittmann,Kadina E. Johnston,Zachary Wu,Frances H. Arnold
Identifier
DOI:10.1016/j.sbi.2021.01.008
Abstract
Machine learning (ML) can expedite directed evolution by allowing researchers to move expensive experimental screens in silico. Gathering sequence-function data for training ML models, however, can still be costly. In contrast, raw protein sequence data is widely available. Recent advances in ML approaches use protein sequences to augment limited sequence-function data for directed evolution. We highlight contributions in a growing effort to use sequences to reduce or eliminate the amount of sequence-function data needed for effective in silico screening. We also highlight approaches that use ML models trained on sequences to generate new functional sequence diversity, focusing on strategies that use these generative models to efficiently explore vast regions of protein space.