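Metric (unit)
Benchmark (surveying)
Artificial intelligence
Computer science
Pipeline (software)
Source code
RNA-binding protein
Deep learning
Pairwise comparison
Computational biology
Artificial neural network
Transfer of learning
Machine learning
Sequence (biology)
Biology
Gene
RNA
Genetics
Engineering
Operations management
Geodesy
Programming language
Geography
Operating system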
Authors
Hehe Wu, Xiaoyong Pan, Yang Yang, Hong-Bin Shen
Abstract
Circular RNAs (circRNAs) interact with RNA-binding proteins (RBPs) to play crucial roles in gene regulation and disease development. Computational approaches that quickly predict likely RBP binding sites on circRNAs from statistical knowledge of sequence or structure binding patterns have attracted much attention. Deep learning is a popular model class in this area, but it usually requires a large amount of labeled training data and therefore performs unsatisfactorily for less characterized RBPs with few known target circRNAs. Improving prediction performance for such RBPs with small labeled sets is a challenging task for deep learning-based models. In this study, we propose iDeepC, an RBP-specific method for predicting RBP binding sites on circRNAs from sequences. It adopts a Siamese neural network consisting of a lightweight attention module and a metric module. We found that the Siamese neural network effectively enhances the network's ability to capture mutual information between circRNAs through pairwise metric learning. To further address the small-sample-size problem, we pretrain the model on available labeled data from other RBPs and demonstrate the efficacy of this transfer-learning pipeline. We comprehensively evaluated iDeepC on benchmark datasets of RBP-binding circRNAs, and the results suggest that iDeepC achieves promising results on poorly characterized RBPs. The source code is available at https://github.com/hehew321/iDeepC.
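
To make the idea of pairwise metric learning with a shared (Siamese) encoder concrete, the following is a minimal NumPy sketch. It is not the iDeepC implementation: the abstract does not specify the layer design or the loss, so the attention-pooling encoder, the contrastive loss, and all names (`one_hot`, `embed`, `contrastive_loss`, `weights`) are assumptions chosen for illustration only.

```python
import numpy as np

ALPHABET = "ACGU"  # assumed RNA alphabet; other symbols (e.g. N) encode as all zeros


def one_hot(seq):
    """Encode an RNA sequence as a (len, 4) one-hot matrix (T is treated as U)."""
    mat = np.zeros((len(seq), len(ALPHABET)))
    for i, base in enumerate(seq.upper().replace("T", "U")):
        j = ALPHABET.find(base)
        if j >= 0:
            mat[i, j] = 1.0
    return mat


def embed(seq, weights):
    """Hypothetical shared encoder: softmax attention over positions, then a projection.

    Both sequences of a pair go through this same function with the same
    weights, which is what makes the setup 'Siamese'.
    """
    x = one_hot(seq)                          # (L, 4)
    scores = x @ weights["attn"]              # (L,) one attention score per position
    attn = np.exp(scores - scores.max())      # numerically stable softmax
    attn /= attn.sum()
    pooled = attn @ x                         # (4,) attention-weighted average
    return np.tanh(pooled @ weights["proj"])  # (d,) embedding


def contrastive_loss(e1, e2, same_label, margin=1.0):
    """Standard contrastive loss (an assumption, not necessarily the paper's loss).

    Pairs with the same label are pulled together; pairs with different
    labels are pushed at least `margin` apart.
    """
    d = np.linalg.norm(e1 - e2)
    if same_label:
        return float(d ** 2)
    return float(max(0.0, margin - d) ** 2)


# Toy usage with random, untrained weights and made-up sequences.
rng = np.random.default_rng(0)
weights = {"attn": rng.normal(size=4), "proj": rng.normal(size=(4, 8))}
e_a = embed("ACGUACGGAU", weights)
e_b = embed("UUGCAUGCAA", weights)
loss_same = contrastive_loss(e_a, e_b, same_label=True)
loss_diff = contrastive_loss(e_a, e_b, same_label=False)
```

In a sketch like this, transfer learning would amount to initializing `weights` from an encoder trained on pairs from data-rich RBPs and then fine-tuning on the few labeled pairs of the target RBP; the actual pretraining procedure is described in the paper and the linked repository.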