德布鲁因图
算法
计算机科学
图形
水准点(测量)
理论计算机科学
大地测量学
地理
作者
Zeinab Zare Hosseini,Shekoufeh Kolahdouz-Rahimi,Esmaeil Forouzan,Ahmad Baraani
标识
DOI:10.1142/s0219720021500050
摘要
The de Bruijn Graph algorithm (DBG) as one of the cornerstones algorithms in short read assembly has extended with the rapid advancement of the Next Generation Sequencing (NGS) technologies and low-cost production of millions of high-quality short reads. Erroneous reads, non-uniform coverage, and genomic repeats are three major problems that influence the performance of short read assemblers. To encounter these problems, the iterative DBG algorithm applies multiple [Formula: see text]-mers instead of a single [Formula: see text]-mer, by iterating the DBG graph over a range of [Formula: see text]-mer sizes from the minimum to the maximum. However, the iteration paradigm of iterative DBG deals with complex graphs from the beginning of the algorithm and therefore, causes more potential errors and computational time for resolving various unreal branches. In this research, we propose the Reverse Modified Iterative DBG graph (named RMI-DBG) for short read assembly. RMI-DBG utilizes the DBG algorithm and String graph to achieve the advantages of both algorithms. We present that RMI-DBG performs faster with comparable results in comparison to iterative DBG. Additionally, the quality of the proposed algorithm in terms of continuity and accuracy is evaluated with some commonly-used assemblers via several real datasets of the GAGE-B benchmark.
科研通智能强力驱动
Strongly Powered by AbleSci AI