Explainable Transcription Factor Prediction with Protein Language Models

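Computer science; Artificial intelligence; Transcription factor; Factor (programming language); Natural language processing; Computational biology; Gene; Programming language; Biology; Genetics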

Authors

Liyuan Gao, K.-H. Shu, Jun Zhang, Victor S. Sheng

Identifier

DOI: 10.1109/bibm58861.2023.10385498

Abstract

Language models have exhibited remarkable performance across diverse tasks, including those in biological research such as protein language modeling. Transcription factors (TFs) are pivotal in gene regulation, influencing gene expression by binding specific DNA sequences. While various TF prediction techniques exist, they often require extensive training datasets or suffer from limited accuracy. In this study, we propose ESM-TFpredict, a model that uses a pre-trained protein language model to encode amino acid sequences, followed by 1-D convolutional neural networks for TF prediction. To elucidate the model's decision-making, we employ an integrated gradients method to highlight the important features driving TF identification. Comparative experimental analysis with existing models, DeepTFactor and TFpredict, shows that ESM-TFpredict scores above 95% on four evaluation metrics, surpassing both competitors. By using a sliding-window approach to compress the protein representation, the training time of ESM-TFpredict is 315.78 seconds, which is only 51% of the training time required by DeepTFactor and a mere 12% of the training time required by TFpredict. We further analyze the contributions of known TF-related regions (average attribution score 0.9152) versus non-TF-related regions (average attribution score 0.0848), demonstrating that the TF-related regions dominate TF prediction.
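The abstract names three techniques without giving their details: sliding-window compression of the protein representation, integrated gradients, and a comparison of attribution inside and outside TF-related regions. The sketch below illustrates each idea in plain Python under stated assumptions; it is not the authors' implementation. The window size, the stride, mean pooling as the compression rule, the toy logistic model standing in for the ESM encoder and 1-D CNN, the all-zero baseline, and the normalization of attributions into shares that sum to 1 are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def sliding_window_mean(embeddings, window, stride):
    """Compress an L x D per-residue matrix by averaging each window of rows.

    Assumption: mean pooling; the abstract does not say how windows are reduced.
    Trailing residues that do not fill a complete window are dropped.
    """
    if window <= 0 or stride <= 0:
        raise ValueError("window and stride must be positive")
    dim = len(embeddings[0])
    last_start = max(len(embeddings) - window, 0)
    pooled = []
    for start in range(0, last_start + 1, stride):
        chunk = embeddings[start:start + window]
        pooled.append([sum(row[d] for row in chunk) / len(chunk) for d in range(dim)])
    return pooled


def model(x, weights, bias):
    """Toy stand-in classifier (logistic regression), NOT the paper's CNN."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def model_gradient(x, weights, bias):
    """Analytic gradient of the toy model with respect to its input."""
    p = model(x, weights, bias)
    return [p * (1.0 - p) * w for w in weights]


def integrated_gradients(x, baseline, weights, bias, steps=200):
    """Approximate integrated gradients along the straight path baseline -> x.

    Uses a midpoint Riemann sum with `steps` evaluation points.
    """
    total = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        grad = model_gradient(point, weights, bias)
        total = [t + g for t, g in zip(total, grad)]
    return [(xi - b) * t / steps for xi, b, t in zip(x, baseline, total)]


def attribution_share(attributions, region):
    """Split absolute attribution into (inside region, outside region) shares.

    Assumption: shares are normalized to sum to 1, which is consistent with
    the two scores quoted in the abstract but is not confirmed by it.
    """
    magnitudes = [abs(a) for a in attributions]
    total = sum(magnitudes)
    if total == 0:
        raise ValueError("all attributions are zero")
    inside = sum(magnitudes[i] for i in region)
    return inside / total, (total - inside) / total


# Made-up numbers purely for demonstration.
per_residue = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
compressed = sliding_window_mean(per_residue, window=2, stride=2)

features = [2.0, 1.5, 0.1, -0.2]
zero_baseline = [0.0, 0.0, 0.0, 0.0]
toy_weights = [1.2, 0.8, 0.1, 0.05]
toy_bias = -1.0
attributions = integrated_gradients(features, zero_baseline, toy_weights, toy_bias)
inside_share, outside_share = attribution_share(attributions, region=[0, 1])
print(compressed, inside_share, outside_share)
```

A useful sanity check on any integrated gradients implementation is the completeness property: the attributions should sum to the difference between the model output at the input and at the baseline, up to the error of the numerical integration.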



