BERT-TFBS: a novel BERT-based model for predicting transcription factor binding sites by transfer learning

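Computer science, DNA binding site, Encoder, Convolutional neural network, Encoding, Transfer learning, Artificial intelligence, Source code, Deep learning, Machine learning, Promoter, Gene, Biology, Genetics, Gene expression, Operating system
Authors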
Kai Wang,Xuan Zeng,Jingwen Zhou,Fei Liu,Xiaoli Luan,Xinglong Wang
Source
Journal: Briefings in Bioinformatics [Oxford University Press]
Volume (issue): 25 (3) Citations: 7
Identifier
DOI:10.1093/bib/bbae195
Abstract

Transcription factors (TFs) are proteins essential for regulating genetic transcriptions by binding to transcription factor binding sites (TFBSs) in DNA sequences. Accurate predictions of TFBSs can contribute to the design and construction of metabolic regulatory systems based on TFs. Although various deep-learning algorithms have been developed for predicting TFBSs, the prediction performance needs to be improved. This paper proposes a bidirectional encoder representations from transformers (BERT)-based model, called BERT-TFBS, to predict TFBSs solely based on DNA sequences. The model consists of a pre-trained BERT module (DNABERT-2), a convolutional neural network (CNN) module, a convolutional block attention module (CBAM) and an output module. The BERT-TFBS model utilizes the pre-trained DNABERT-2 module to acquire the complex long-term dependencies in DNA sequences through a transfer learning approach, and applies the CNN module and the CBAM to extract high-order local features. The proposed model is trained and tested based on 165 ENCODE ChIP-seq datasets. We conducted experiments with model variants, cross-cell-line validations and comparisons with other models. The experimental results demonstrate the effectiveness and generalization capability of BERT-TFBS in predicting TFBSs, and they show that the proposed model outperforms other deep-learning models. The source code for BERT-TFBS is available at https://github.com/ZX1998-12/BERT-TFBS.
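
The abstract names a convolutional block attention module (CBAM) but does not spell out how it works. As a rough illustration only, the sketch below follows the general CBAM recipe: channel attention first (a shared two-layer MLP applied to average-pooled and max-pooled channel descriptors), then spatial attention (a small convolution over the channel-wise average and max maps). It is not the authors' implementation, which is in the linked repository. The assumptions here are a one-dimensional feature map laid out as channels x positions, randomly initialised weights, and pure standard-library Python in place of a deep-learning framework. Every function and variable name is hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def shared_mlp(vec, w1, w2):
    """Two-layer MLP with ReLU. w1 is (hidden x C), w2 is (C x hidden)."""
    hidden = [max(0.0, sum(w * v for w, v in zip(row, vec))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]


def channel_attention(feat, w1, w2):
    """One weight per channel, from average- and max-pooled descriptors."""
    avg = [sum(ch) / len(ch) for ch in feat]
    mx = [max(ch) for ch in feat]
    a = shared_mlp(avg, w1, w2)
    m = shared_mlp(mx, w1, w2)
    return [sigmoid(x + y) for x, y in zip(a, m)]


def spatial_attention(feat, kernel):
    """One weight per position, from channel-wise average and max maps.

    kernel is (2 x k) with odd k: row 0 filters the average map,
    row 1 filters the max map. Borders are zero-padded.
    """
    length = len(feat[0])
    avg = [sum(ch[i] for ch in feat) / len(feat) for i in range(length)]
    mx = [max(ch[i] for ch in feat) for i in range(length)]
    k = len(kernel[0])
    pad = k // 2
    weights = []
    for i in range(length):
        s = 0.0
        for j in range(k):
            p = i + j - pad
            if 0 <= p < length:
                s += kernel[0][j] * avg[p] + kernel[1][j] * mx[p]
        weights.append(sigmoid(s))
    return weights


def cbam(feat, w1, w2, kernel):
    """Apply channel attention, then spatial attention, to feat (C x L)."""
    ca = channel_attention(feat, w1, w2)
    feat = [[c * v for v in ch] for c, ch in zip(ca, feat)]
    sa = spatial_attention(feat, kernel)
    return [[s * v for s, v in zip(sa, ch)] for ch in feat]


# Toy input: 4 channels, 8 positions, hidden size 2, kernel width 3.
rng = random.Random(0)
C, L, H, K = 4, 8, 2, 3
feat = [[rng.uniform(-1, 1) for _ in range(L)] for _ in range(C)]
w1 = [[rng.uniform(-1, 1) for _ in range(C)] for _ in range(H)]
w2 = [[rng.uniform(-1, 1) for _ in range(H)] for _ in range(C)]
kernel = [[rng.uniform(-1, 1) for _ in range(K)] for _ in range(2)]
refined = cbam(feat, w1, w2, kernel)
```

Both attention maps pass through a sigmoid, so each output value is the corresponding input value scaled by a factor between 0 and 1. The module reweights features and leaves the shape of the feature map unchanged.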