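Sequence assembly
k-mer
Computer science
String (physics)
Set (abstract data type)
Genome
Throughput
Error detection and correction
Sequence (biology)
Algorithm
Reference genome
Index (typography)
State (computer science)
Computational biology
Biology
Mathematics
Genetics
Telecommunications
Gene
Programming language
Gene expression
Transcriptome
Wireless
Mathematical physics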
Authors
James Holt, Jeremy Wang, Corbin D. Jones, Leonard McMillan
Abstract
Long read sequencing is changing the landscape of genomic research, especially de novo assembly. Despite the high error rate inherent to long read technologies, increased read lengths dramatically improve the continuity and accuracy of genome assemblies. However, the cost and throughput of these technologies limit their application to complex genomes. One solution is to decrease the cost and time to assemble novel genomes by leveraging “hybrid” assemblies that use long reads for scaffolding and short reads for accuracy. To this end, we describe a novel application of a multi-string Burrows-Wheeler Transform with auxiliary FM-index to correct errors in long read sequences using a set of complementary short reads. We show that our method efficiently produces significantly higher quality corrected sequence than existing hybrid error-correction methods. We demonstrate the effectiveness of our method compared to state-of-the-art hybrid and long-read only de novo assembly methods.
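As a rough illustration of the data structure named in the abstract, the sketch below builds a multi-string Burrows-Wheeler Transform over a few short reads and uses FM-index backward search to count how often a k-mer occurs in them. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`build_bwt`, `build_fm_index`, `count_kmer`) are hypothetical, the BWT is built by naive rotation sorting (practical only for toy inputs), and the abstract does not state how such counts feed into correction.

```python
from collections import Counter


def build_bwt(reads):
    """Naive multi-string BWT: terminate each read with '$', then sort rotations.

    Assumption: reads contain only A/C/G/T, so '$' sorts before every base.
    """
    text = "".join(read + "$" for read in reads)
    order = sorted(range(len(text)), key=lambda i: text[i:] + text[:i])
    # The BWT is the character preceding each sorted rotation.
    return "".join(text[i - 1] for i in order)


def build_fm_index(bwt):
    """Return (first, occ): first[c] is the number of symbols smaller than c,
    occ[c][i] is the number of occurrences of c in bwt[:i]."""
    counts = Counter(bwt)
    first, total = {}, 0
    for c in sorted(counts):
        first[c] = total
        total += counts[c]
    occ = {c: [0] * (len(bwt) + 1) for c in counts}
    for i, ch in enumerate(bwt):
        for c in occ:
            occ[c][i + 1] = occ[c][i] + (1 if c == ch else 0)
    return first, occ


def count_kmer(kmer, first, occ, n):
    """Count occurrences of kmer across all reads via backward search."""
    lo, hi = 0, n
    for c in reversed(kmer):
        if c not in first:
            return 0
        lo = first[c] + occ[c][lo]
        hi = first[c] + occ[c][hi]
        if lo >= hi:
            return 0
    return hi - lo


if __name__ == "__main__":
    short_reads = ["ACGTAC", "CGTACG", "TTACGA"]  # toy data, not from the paper
    bwt = build_bwt(short_reads)
    first, occ = build_fm_index(bwt)
    for kmer in ("ACG", "GTAC", "AAA"):
        print(kmer, count_kmer(kmer, first, occ, len(bwt)))
```

Because every read ends in `$` and queried k-mers contain no `$`, matches cannot span two reads. A k-mer from a long read that is rare or absent in the short reads would be a natural candidate for an error, though that use of the counts is an assumption here.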