
Genome-wide discovery of pre-miRNAs: comparison of recent approaches based on machine learning

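Genome; Homo sapiens; Caenorhabditis elegans; Computational biology; Context (archaeology); Biology; Computer science; Drosophila melanogaster; Genomics; Genetics; Gene; Anthropology; Sociology; Paleontology
Authors
Leandro A. Bugnon, Cristian Yones, Diego H. Milone, Georgina Stegmayer
Source
Journal: Briefings in Bioinformatics [Oxford University Press]
Volume (issue): 22 (3) Citations: 18
Identifier
DOI: 10.1093/bib/bbaa184
Abstract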

The genome-wide discovery of microRNAs (miRNAs) involves identifying the sequences with the highest chance of being a novel miRNA precursor (pre-miRNA) among all the possible sequences in a complete genome. The known pre-miRNAs are usually just a few in comparison to the millions of candidates that have to be analyzed. This is of particular interest in non-model species and recently sequenced genomes, where the challenge is to find potential pre-miRNAs from the sequenced genome alone. The task is unfeasible without the help of computational methods, such as deep learning. However, it is still very difficult to find an accurate predictor with a low false positive rate in this genome-wide context. Although many tools are available, they have not been tested in realistic conditions, with sequences from whole genomes and the high class imbalance inherent to such data.

In this work, we review six recent methods for tackling this problem with machine learning. We compare the models on five genome-wide datasets: Arabidopsis thaliana, Caenorhabditis elegans, Anopheles gambiae, Drosophila melanogaster and Homo sapiens. The models have been designed for the pre-miRNA prediction task, where the class of interest (the known pre-miRNAs) is significantly underrepresented with respect to a very large number of unlabeled samples. It was found that for the smaller genomes and smaller imbalances, all methods perform in a similar way. However, for larger datasets such as the H. sapiens genome, deep learning approaches using raw information from the sequences reached the best scores, achieving low numbers of false positives.

The source code to reproduce these results is available at http://sourceforge.net/projects/sourcesinc/files/gwmirna. Additionally, the datasets are freely available at https://sourceforge.net/projects/sourcesinc/files/mirdata.
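
To illustrate why the abstract stresses a low false positive rate under high class imbalance, the following minimal Python sketch computes the expected precision of a classifier when the same true and false positive rates are applied to a balanced set and to a heavily imbalanced one. The function name and all numbers are hypothetical and chosen only for illustration; none are taken from the paper or its datasets.

```python
def precision_under_imbalance(n_pos, n_neg, tpr, fpr):
    """Expected precision for a classifier with true positive rate `tpr`
    and false positive rate `fpr`, applied to `n_pos` positive and
    `n_neg` negative candidates."""
    true_pos = tpr * n_pos    # expected correctly flagged pre-miRNAs
    false_pos = fpr * n_neg   # expected wrongly flagged candidates
    predicted_pos = true_pos + false_pos
    if predicted_pos == 0:
        return 0.0            # nothing was flagged at all
    return true_pos / predicted_pos


# Hypothetical counts and rates (assumptions, not values from the paper).
balanced = precision_under_imbalance(1_000, 1_000, tpr=0.9, fpr=0.01)
genome_wide = precision_under_imbalance(1_000, 10_000_000, tpr=0.9, fpr=0.01)

print(f"balanced precision:    {balanced:.3f}")
print(f"genome-wide precision: {genome_wide:.4f}")
```

With these assumed numbers, the same classifier goes from almost all flagged candidates being correct on the balanced set to fewer than one in a hundred on the imbalanced one, which is why a predictor evaluated only on balanced data can look far better than it behaves genome-wide.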
