
SNooPer: a machine learning-based method for somatic variant identification from low-pass next-generation sequencing

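Keywords: Biology; DNA sequencing; Deep sequencing; Exome sequencing; Exome; Computational biology; Massively parallel sequencing; Concordance; Genome; Genetics; Computer science; Mutation; Gene
Authors
Jean-François Spinella, Pamela Mehanna, Ramón Vidal, Virginie Saillour, Pauline Cassart, Chantal Richer, Manon Ouimet, Jasmine Healy, Daniel Sinnett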
Source
Journal: BMC Genomics [Springer Nature]
Volume (issue): 17 (1) Citations: 65
Identifier
DOI: 10.1186/s12864-016-3281-2
Abstract

Next-generation sequencing (NGS) allows unbiased, in-depth interrogation of cancer genomes. Many somatic variant callers have been developed, yet accurate ascertainment of somatic variants remains a considerable challenge, as evidenced by the varying mutation call rates and low concordance among callers. Statistical model-based algorithms that are currently available perform well under ideal scenarios, such as high sequencing depth, homogeneous tumor samples, and high somatic variant allele frequency (VAF), but show limited performance with sub-optimal data such as low-pass whole-exome/genome sequencing data. While the goal of any cancer sequencing project is to identify a relevant, and limited, set of somatic variants for further sequence/functional validation, the inherently complex nature of cancer genomes combined with technical issues directly related to sequencing and alignment can affect the specificity and/or sensitivity of most callers. For these reasons, we developed SNooPer, a versatile machine learning approach that uses Random Forest classification models to accurately call somatic variants in low-depth sequencing data. To train the data-specific model, SNooPer uses a subset of variant positions from the sequencing output for which the class (true variation or sequencing error) is known. Here, using a real dataset of 40 childhood acute lymphoblastic leukemia patients, we show that the SNooPer algorithm is not affected by low coverage or low VAFs, and can be used to reduce overall sequencing costs while maintaining high specificity and sensitivity in somatic variant calling. When compared to three benchmarked somatic callers, SNooPer demonstrated the best overall performance.
The flexibility of SNooPer's random forest protects against technical bias and systematic errors, and is appealing in that it does not rely on user-defined parameters. The code and user guide can be downloaded at https://sourceforge.net/projects/snooper/.
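The training idea described in the abstract (fit a classifier on positions whose class, true variation or sequencing error, is already known, then use it to call the rest) can be illustrated with a toy sketch. This is not SNooPer's code: the two features (VAF and mean base quality), all numeric values, and every function name below are assumptions made for illustration, and the "forest" is a simplified one built from bootstrapped single-split trees (decision stumps) rather than full decision trees.

```python
import random
from collections import Counter

# Hypothetical features per candidate position: (VAF, mean base quality).
# Labels: "variant" = true variation, "error" = sequencing error.
# All values are invented for illustration; they are not from the paper.
TRAIN_ROWS = [(0.30 + 0.02 * i, 35 + 0.5 * i) for i in range(10)] + \
             [(0.01 + 0.004 * i, 10 + 1.0 * i) for i in range(10)]
TRAIN_LABELS = ["variant"] * 10 + ["error"] * 10

HELD_OUT_ROWS = [(0.40, 38.0), (0.33, 36.0), (0.02, 12.0), (0.04, 18.0)]
HELD_OUT_LABELS = ["variant", "variant", "error", "error"]


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels, feature):
    """Pick the threshold on one feature that misclassifies the fewest rows."""
    best = None
    values = sorted(set(row[feature] for row in rows))
    for lo, hi in zip(values, values[1:]):
        thr = (lo + hi) / 2
        left = [y for row, y in zip(rows, labels) if row[feature] <= thr]
        right = [y for row, y in zip(rows, labels) if row[feature] > thr]
        left_cls, right_cls = majority(left), majority(right)
        errors = sum(y != left_cls for y in left) + sum(y != right_cls for y in right)
        if best is None or errors < best[0]:
            best = (errors, thr, left_cls, right_cls)
    if best is None:  # feature is constant in this sample: predict the majority
        cls = majority(labels)
        return (feature, 0.0, cls, cls)
    return (feature, best[1], best[2], best[3])


def train_forest(rows, labels, n_trees=25, seed=0):
    """Bagging plus a random feature per tree, the two ingredients of a random forest."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]  # bootstrap sample
        feature = rng.randrange(len(rows[0]))           # random feature choice
        forest.append(train_stump([rows[i] for i in idx],
                                  [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Majority vote over all stumps."""
    return majority([left if row[f] <= thr else right
                     for f, thr, left, right in forest])


def sensitivity_specificity(forest, rows, labels):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = fn = tn = fp = 0
    for row, truth in zip(rows, labels):
        called = predict(forest, row)
        if truth == "variant":
            tp += called == "variant"
            fn += called != "variant"
        else:
            tn += called == "error"
            fp += called != "error"
    return tp / (tp + fn), tn / (tn + fp)


forest = train_forest(TRAIN_ROWS, TRAIN_LABELS)
print(sensitivity_specificity(forest, HELD_OUT_ROWS, HELD_OUT_LABELS))
```

The toy data are cleanly separable, so the sketch says nothing about performance on real low-pass data; it only shows the shape of the workflow. A real analysis would use many more features and a full random forest implementation, for example scikit-learn's `RandomForestClassifier`.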
