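Computer science
Artificial intelligence
Deep learning
Feature engineering
Benchmark (surveying)
Feature (linguistics)
Machine learning
Feature vector
Coding (social sciences)
Word (group theory)
Sequence (biology)
Data mining
Mathematics
Biology
Geometry
Linguistics
Geography
Geodesy
Philosophy
Statistics
Genetics
Authors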
Xinyu Yu,Lin Jiang,Shuting Jin,Xiangxiang Zeng,Xiangrong Liu
Abstract
The interaction between micro ribonucleic acid (miRNA) and long non-coding ribonucleic acid (lncRNA) plays a very important role in biological processes, and predicting these interactions is of great significance to the study of their mechanism of action. Due to the limitations of traditional biological experiment methods, more and more computational methods are being applied to this field. However, existing methods often have problems, such as inadequate capture of the latent features of a sequence due to simple coding, and the need to manually extract features as input. We propose a deep learning model, preMLI, based on rna2vec pre-training and a deep feature mining mechanism. We use rna2vec to train on the ribonucleic acid (RNA) dataset and obtain RNA word vector representations, then mine the RNA sequence features separately, and finally concatenate the two feature vectors as the input of the prediction task. preMLI performs better than existing methods on benchmark datasets and has cross-species prediction capabilities. Experiments show that both pre-training and the deep feature mining mechanism improve the prediction performance of the model; in particular, pre-training provides more accurate word vector representations. Moreover, preMLI needs only the RNA sequence as model input and has better cross-species prediction performance than state-of-the-art prediction models, which gives it reference value for related research.
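The data flow described in the abstract (sequence to word vectors, per-sequence features, concatenation) can be illustrated with a minimal sketch. This is not the preMLI implementation: the k-mer size, the embedding dimension, the randomly initialised lookup table (a stand-in for pre-trained rna2vec vectors), and the mean pooling (a stand-in for the deep feature mining mechanism) are all assumptions made for illustration, as are the function names. Reading "the two feature vectors" as one vector per RNA in the pair is also an assumption.

```python
import random
from itertools import product


def kmer_tokens(seq, k=3):
    """Split an RNA sequence into overlapping k-mer 'words' (k is an assumed value)."""
    seq = seq.upper().replace("T", "U")
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def toy_embedding_table(k=3, dim=4, seed=0):
    """Hypothetical stand-in for pre-trained word vectors: one random vector per k-mer."""
    rng = random.Random(seed)
    return {
        "".join(p): [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        for p in product("ACGU", repeat=k)
    }


def mean_embedding(seq, table, k=3):
    """Average the word vectors of a sequence (placeholder for learned feature mining)."""
    dim = len(next(iter(table.values())))
    vecs = [table[t] for t in kmer_tokens(seq, k) if t in table]
    if not vecs:
        return [0.0] * dim
    return [sum(col) / len(vecs) for col in zip(*vecs)]


def pair_features(mirna, lncrna, table, k=3):
    """Featurise each sequence separately, then concatenate the two vectors."""
    return mean_embedding(mirna, table, k) + mean_embedding(lncrna, table, k)


table = toy_embedding_table()
features = pair_features("UGAGGUAGUAGGUUGUAUAGUU", "AUGGCUAGCUAGGAUCCGAUAGC", table)
```

In a real pipeline the concatenated vector would be passed to a trained classifier; here it only shows that the sole input is the pair of raw sequences.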