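Deep learning
Computer science
Artificial intelligence
End-to-end principle
Source code
Machine learning
Feature engineering
Data mining
Computational biology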
Authors
Jonathan Raad, Leandro A Bugnon, Diego H Milone, Georgina Stegmayer
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2021-12-07
Volume (issue), pages: 38 (5): 1191-1197
Identifier
DOI:10.1093/bioinformatics/btab823
Abstract
MicroRNAs (miRNAs) are small RNA sequences with key roles in the regulation of gene expression at the post-transcriptional level in different species. Accurate prediction of novel miRNAs is needed because of their importance in many biological processes and their associations with complicated diseases in humans. Many machine learning approaches have been proposed in the last decade for this purpose, but they require the extraction of handcrafted features in order to identify possible de novo miRNAs. More recently, the emergence of deep learning has allowed automatic feature extraction, with models learning relevant representations by themselves. However, state-of-the-art deep models require complex pre-processing of the input sequences and prediction of their secondary structure in order to reach acceptable performance.
In this work we present miRe2e, the first full end-to-end deep learning model for pre-miRNA prediction. This model is based on Transformers, a neural architecture that uses attention mechanisms to infer global dependencies between inputs and outputs. It can receive raw genome-wide data as input, without any pre-processing or feature engineering. After a training stage with known pre-miRNAs, hairpin and non-hairpin sequences, it can identify all the pre-miRNA sequences within a genome. The model has been validated through several experimental setups using the human genome, and it was compared with state-of-the-art algorithms, obtaining 10 times better performance.
A web demo is available at https://sinc.unl.edu.ar/web-demo/miRe2e/ and source code is available for download at https://github.com/sinc-lab/miRe2e.
Supplementary data are available at Bioinformatics online.