GSRNet, an adversarial training-based deep framework with multi-scale CNN and BiGRU for predicting genomic signals and regions

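Computer science, Artificial intelligence, Embedding, Word error rate, Feature engineering, Robustness (evolution), Deep learning, Pattern recognition (psychology), Machine learning, Speech recognition, Gene, Biology, Biochemistry
Authors
Gancheng Zhu, Yongchang Fan, Fĕi Li, Annebella Tsz Ho Choi, Zhenyu Tan, Yiruo Cheng, Kewei Li, Siyang Wang, Changfan Luo, Hongmei Li, Gongyou Zhang, Zhaomin Yao, Yaqi Zhang, L. Q. Huang, Fengfeng Zhou
Source
Journal: Expert Systems With Applications [Elsevier BV]
Volume: 229, pages 120439-120439; Cited by: 3
Identifier
DOI:10.1016/j.eswa.2023.120439
Abstract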

A genome carries many functional genomic signals and regions (GSRs), which play a vital role in orchestrating the complex biological processes of eukaryotic organisms. Precise recognition of the GSRs within a genomic sequence is the first step toward understanding genomic organization and gene regulation. Previous studies have used machine learning or deep learning algorithms to identify GSRs from hand-crafted features, which frequently fail to capture complex patterns within the GSRs. The one-hot encoding or word2vec embedding algorithms used in several deep learning-based studies can overcome the weaknesses of human-designed features, but they may fail to capture contextual and positional information. The present study proposes GSRNet, a general-purpose end-to-end framework for GSR prediction that integrates DNABERT embedding, adversarial training, BiGRU, and multi-scale CNN to eliminate human involvement in feature engineering. GSRNet is evaluated on two prediction tasks: polyadenylation signals (PAS) and translation initiation sites (TIS). The comparative experiments show that the proposed GSRNet outperforms the state-of-the-art methods reported in previous studies, reducing the error rate by 1.08% and 1.50% for human PAS and TIS prediction, respectively. In relative terms, the model reduces the error rate by up to 8.73% and 32.97%, respectively. The improved detection of the two GSR types (PAS and TIS) across four organisms confirms the effectiveness and robustness of the proposed GSRNet. The source code and the data are freely available at http://www.healthinformaticslab.org/supp/resources.php.
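
To make the contrast between the input representations mentioned in the abstract concrete, the following minimal Python sketch shows one-hot encoding next to the overlapping k-mer tokenization that DNABERT-style models take as input. It is an illustration only, not the authors' code: the function names `one_hot` and `kmer_tokens`, the example window, and the choice k = 6 are assumptions, and the abstract does not state which k GSRNet uses.

```python
from typing import List

ALPHABET = "ACGT"  # assumed base order for the one-hot columns


def one_hot(seq: str) -> List[List[int]]:
    """One-hot encode a DNA sequence, one 4-element row per base.

    Bases outside ACGT (for example N) become an all-zero row.
    Each row is independent of its neighbours, so the encoding
    carries no contextual information by itself.
    """
    return [[1 if base == ref else 0 for ref in ALPHABET] for base in seq.upper()]


def kmer_tokens(seq: str, k: int = 6) -> List[str]:
    """Split a sequence into overlapping k-mers with stride 1.

    Every token spans k neighbouring bases, so local context is
    part of each token before any embedding model sees it.
    Sequences shorter than k yield an empty list.
    """
    if k <= 0:
        raise ValueError("k must be positive")
    seq = seq.upper()
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


if __name__ == "__main__":
    # Hypothetical window containing the canonical AATAAA poly(A) hexamer.
    window = "AATAAAGCT"
    print(one_hot(window)[:2])
    print(kmer_tokens(window, k=6))
```

In a DNABERT-style pipeline the k-mer tokens would then be mapped to vocabulary ids and passed through the pretrained transformer to obtain contextual embeddings; that step is omitted here because it depends on the specific pretrained model.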