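Promoter
Convolutional neural network
Escherichia coli
Artificial intelligence
Database
Computational biology
Biology
Machine learning
Computer science
Genetics
Gene
Gene expression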
Authors
Yukuan Huang,Chi‐Hua Yu,I‐Son Ng
Identifier
DOI:10.1016/j.jtice.2023.105211
Abstract
Promoter strength plays a critical role in modulating protein expression in genetic engineering. However, only a few studies have examined promoter strength using a comprehensive genomic database of sigma factors. To circumvent the time- and resource-intensive experimental approach, artificial intelligence (AI) was used to construct a complete database of proposed promoters from Escherichia coli; prediction algorithms were then used to evaluate promoter strength, and the predictions were confirmed using the fluorescence intensity of green fluorescent protein (GFP). The promoter database was constructed using partial information from Ecocyc, and the predicted strength of the promoters was calculated with the phiSITE hunter tool. The database contained 1744 promoter entries derived from E. coli MG1655, and a total of 935 sigma factor 70 (σ70) promoters were identified. The training database was then used to develop a precise tool for predicting promoter strength using machine learning and six deep learning models. The accuracy of the predictions was confirmed through wet-lab experiments on endogenous and J-series promoters. By employing a deep learning model, particularly the Convolutional Neural Network (CNN), the promoter prediction fitness of phiSITE, which relies on traditional alignment metrics, was validated. In addition, phiSITE gave satisfactory results in the fluorescence experiments with 7 endogenous promoters, achieving an R-squared (R²) of 0.93. When the same model was applied to predict the strength of J-series promoters, the best R² reached 0.99. Thus, the CNN model is an effective AI-based tool for evaluating promoter strength.
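The abstract mentions two computational steps that a short sketch may help make concrete: passing promoter sequences to a CNN, and scoring predictions against measured fluorescence with R². The Python below is a minimal illustration under stated assumptions, not the authors' implementation. The function names `one_hot` and `r_squared` are hypothetical, one-hot encoding is assumed as the CNN input representation (the abstract does not state the encoding), and R² is computed as the coefficient of determination, which may differ from the definition used in the paper (for example, a squared correlation from a linear fit).

```python
def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order.

    Assumption: one-hot encoding is a common CNN input format for DNA;
    the abstract does not say which encoding the authors used.
    """
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    rows = []
    for base in seq.upper():
        row = [0, 0, 0, 0]
        if base in index:  # ambiguous bases such as N stay all-zero
            row[index[base]] = 1
        rows.append(row)
    return rows


def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumption: this is one standard definition of R-squared; the paper
    may report a different variant.
    """
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot
```

For instance, `one_hot("TATAAT")` returns a 6 × 4 matrix for the σ70 −10 consensus hexamer, and `r_squared` would be called with measured GFP intensities and the corresponding predicted strengths in the same order.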