B-151 SelfTrans-Ensemble: Aptamer-protein interaction prediction model based on transformer

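Aptamer; Transformer; Computer science; Computational biology; Artificial intelligence; Molecular biology; Biology; Engineering; Electrical engineering; Voltage
Authors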
Buyong Ma,Zhichao Yan,Yue Kang
Source
Journal: Clinical Chemistry [Oxford University Press]
Volume/Issue: 71 (Supplement_1)
Identifier
DOI:10.1093/clinchem/hvaf086.545
Abstract

Background
Aptamers have drawn significant attention with the emerging prominence of nucleic acid-based therapeutics and diagnostics. Aptamers are single-stranded oligonucleotides or short peptides characterized by a distinctive three-dimensional architecture, with oligonucleotide aptamers comprising 20 to 100 nucleotides (nt). They exhibit high affinity and specificity towards target molecules and have great potential in detection and medicine. The SELEX technique is an empirical experimental method; aptamers obtained with it are often time-consuming to produce and may have low affinity. With the development of computational technology, artificial intelligence algorithms have demonstrated excellent performance in the field of nucleic acids, and several machine learning approaches have been published to predict protein-aptamer interactions.

Methods
Here, we present SelfTrans-Ensemble, a deep learning model that integrates sequence information models and structural information models to extract multi-scale features for predicting aptamer-protein interactions (APIs). The model employs two pre-trained models, ProtBert and RNA-FM, to encode protein and aptamer sequences, along with features generated from primary sequence and secondary structural information. To address the imbalance in the aptamer dataset, we incorporated short RNA-protein interaction data into the training set.

Results
We compiled a dataset consisting of 1422 aptamer/RNA sequences and 848 protein sequences, for a total of 1934 aptamer/RNA-protein interaction entries. The model achieved a training accuracy of 98.9% and a test accuracy of 88.0%, demonstrating its effectiveness in predicting APIs. We investigated the attention learned for aptamer and protein sequences to identify the residues and nucleotides that enable APIs, and to evaluate whether the transformer-based network captures short-range and long-range dependencies efficiently in aptamer and protein sequences. For a DNA aptamer binding to Von Willebrand Factor (VWF, PDB 3HXO), we found that the attention layer associates strongly with binding correlation, which is consistent with previous structural analyses. Additionally, analysis using molecular simulation indicated that SelfTrans-Ensemble is sensitive to aptamer sequence mutations.

Conclusion
SelfTrans-Ensemble achieves an F1 score of 0.896 and an AUC of 0.9232, indicating that the model can effectively predict APIs. We further explored the sensitivity of the model by assessing its response to double mutations in RNA sequences and found that the transformer-based model can capture small mutations in sequences, providing insight into the model's applicability to RNA design approaches aimed at targeting specific proteins. Our approach has the potential to serve as a rapid and reliable screen for aptamer sequences that bind target proteins, improving the cost-effectiveness and efficiency of SELEX in aptamer screening.
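
The double-mutation sensitivity check mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the abstract does not describe how mutants were generated or scored, so `double_mutants`, `sensitivity_scan`, and `toy_score` are hypothetical names, and `toy_score` is only a stand-in (GC fraction) for a trained predictor such as SelfTrans-Ensemble. The sketch assumes substitutions over the four-letter RNA alphabet and no insertions or deletions.

```python
from itertools import combinations, product

RNA_ALPHABET = "ACGU"  # assumption: substitutions only, standard RNA bases


def double_mutants(seq):
    """Yield every sequence that differs from `seq` at exactly two positions."""
    for i, j in combinations(range(len(seq)), 2):
        alts_i = [b for b in RNA_ALPHABET if b != seq[i]]
        alts_j = [b for b in RNA_ALPHABET if b != seq[j]]
        for a, b in product(alts_i, alts_j):
            mutant = list(seq)
            mutant[i], mutant[j] = a, b
            yield "".join(mutant)


def sensitivity_scan(seq, score_fn):
    """Return (mutant, score change) pairs, largest score drop first."""
    base = score_fn(seq)
    deltas = [(m, score_fn(m) - base) for m in double_mutants(seq)]
    return sorted(deltas, key=lambda pair: pair[1])


def toy_score(seq):
    # Hypothetical placeholder: GC fraction, NOT a binding predictor.
    # In practice this would wrap a trained aptamer-protein model.
    return sum(base in "GC" for base in seq) / len(seq)


if __name__ == "__main__":
    ranked = sensitivity_scan("GGACUU", toy_score)
    print(len(ranked), "double mutants; largest drop:", ranked[0])
```

For a sequence of length n over the four standard bases, the scan covers 9 · n(n−1)/2 mutants, so exhaustive enumeration is practical for aptamer-length sequences of 20 to 100 nt only if the scoring function is fast.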
