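Synthetic biology
Bacillus subtilis
Saccharomyces cerevisiae
Bioproduction
Computational biology
Biology
Gene
Metabolic engineering
Systems biology
Computer science
Genetics
Bacteria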
Authors
C. Wang, Wei Zhang, Rongzhen Tian, Jianing Zhang, Linpei Zhang, Zhaohong Deng, Xueqin Lv, Jianghua Li, Long Liu, Guocheng Du, Yanfeng Liu
Identifier
DOI:10.1002/biot.202100655
Abstract
N-terminal coding sequences (NCSs) are key regulatory elements for fine-tuning gene expression during translation initiation, the rate-limiting step of translation. However, owing to the complex combinatorial effects of NCS biophysical factors and endogenous regulation, designing NCSs remains challenging. In this study, a multi-view learning strategy was implemented for model-driven generation of synthetic NCSs for Saccharomyces cerevisiae and Bacillus subtilis, two organisms widely used in laboratories and industry. NCS libraries for S. cerevisiae and B. subtilis comprising nearly 150,000 cells were sorted. Next, model training was performed with NCS deep features extracted from DNA, codon, and amino acid sequences, as well as calculated features from the minimum free energy (MFE) and tRNA adaptation index. Two models were separately developed for generating synthetic NCSs for both up- and down-regulating gene expression, with accuracies higher than 65% for S. cerevisiae and B. subtilis. Synthetic NCSs were then applied to enhance bioproduction, yielding 1.48- and 1.71-fold production improvements of D-limonene by S. cerevisiae and ovalbumin by B. subtilis, respectively. This work provides model-driven design of synthetic NCSs as a toolbox for regulating gene expression in S. cerevisiae and B. subtilis. The machine learning-based modeling approach can be used for NCS design in other microorganisms.
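The abstract names three sequence "views" of an NCS (DNA, codon, and amino acid) but does not describe how they were encoded. As a rough illustration only, the sketch below splits one in-frame sequence into those three token views using the standard genetic code. The function name `ncs_views` and the example sequence are hypothetical, not taken from the paper, and the MFE and tRNA adaptation index features are left out because they need a folding tool and organism-specific weights that the abstract does not specify.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def ncs_views(ncs):
    """Split an in-frame DNA sequence into DNA, codon, and amino acid tokens.

    Hypothetical helper for illustration; not the authors' feature extractor.
    """
    seq = ncs.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    if set(seq) - set(BASES):
        raise ValueError("sequence may contain only A, C, G, and T")

    # Codon view: non-overlapping triplets read from the first base.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Amino acid view: one letter per codon ('*' marks a stop codon).
    residues = [CODON_TABLE[c] for c in codons]

    return {"dna": list(seq), "codon": codons, "amino_acid": residues}


if __name__ == "__main__":
    # Made-up 9-nt example, not a sequence from the study.
    views = ncs_views("ATGGCTAAA")
    print(views["codon"])
    print(views["amino_acid"])
```

Each view could then be passed to its own encoder before the representations are combined, which is the general idea behind multi-view learning. How the study actually combined them is not stated in the abstract.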