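Promoter
Convolutional neural network
Computational biology
Computer science
Artificial neural network
Identification (biology)
Upstream (networking)
Layer (electronics)
DNA sequencing
Artificial intelligence
DNA
Pattern recognition (psychology)
Gene
Biology
Genetics
Gene expression
Telecommunications
Nanotechnology
Botany
Materials science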
Authors
Zhimin Zhang,Jianping Zhao,Pi-Jing Wei,Chun-Hou Zheng
Identifier
DOI:10.1016/j.cmpb.2022.107087
Abstract
• A new two-layer promoter predictor, iPro2L-CLA, identifies promoters and predicts their strength.
• We propose a new hybrid of a capsule network and a recurrent neural network to identify promoters and predict their strength.
• The model attains cross-validation accuracies of 86% and 73.46% in prokaryotic promoter recognition and strength prediction, respectively.

A promoter is a DNA fragment with a specific sequence that regulates transcription. Promoters are located upstream of the transcription start site and initiate downstream gene expression. So far, promoters have been identified mainly by biological methods, which often require considerable effort. Computational methods have become a more efficient way to classify and predict promoter types. In this study, we propose a new hybrid model that combines a capsule network and a recurrent neural network to identify promoters and predict their strength. First, we encode the DNA sequence with one-hot encoding. Second, we use three one-dimensional convolutional layers, a one-dimensional convolutional capsule layer, and a digit capsule layer to learn local features. Third, a bidirectional long short-term memory (BiLSTM) network extracts global features. Finally, we adopt a self-attention mechanism to increase the contribution of relatively important features, which further enhances the model's performance. Our model attains cross-validation accuracies of 86% and 73.46% in prokaryotic promoter recognition and strength prediction, respectively, outperforming existing approaches in both the first layer (promoter identification) and the second layer (promoter strength prediction). Our model not only combines a convolutional neural network with capsule layers but also uses a self-attention mechanism to better capture hidden sequence-level features. We therefore hope that our model can be widely applied to other components.
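
The first step named in the abstract, one-hot encoding of the DNA sequence, can be sketched as follows. This is a minimal illustration and not the authors' code: the helper name `one_hot_encode`, the column order A, C, G, T, and the all-zero row for ambiguous bases such as N are assumptions, since the abstract does not specify them.

```python
# Assumed column order for the four nucleotides; the abstract does not state one.
BASES = "ACGT"


def one_hot_encode(sequence):
    """Encode a DNA string as a list of 4-element rows, one row per nucleotide.

    Bases outside A/C/G/T (for example N) become all-zero rows; this handling
    is an assumption made for the sketch.
    """
    rows = []
    for base in sequence.upper():
        row = [0] * len(BASES)
        index = BASES.find(base)  # -1 when the base is not in BASES
        if index >= 0:
            row[index] = 1
        rows.append(row)
    return rows


if __name__ == "__main__":
    for base, row in zip("ACGTN", one_hot_encode("ACGTN")):
        print(base, row)
```

A sequence of length n becomes an n × 4 matrix, which is the kind of input a one-dimensional convolutional layer can slide over along the sequence axis.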