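Contig
De Bruijn sequence
Velvet
De Bruijn graph
Biology
k-mer
Sequence assembly
Algorithm
Computer science
Computational biology
Genetics
Combinatorics
DNA sequencing
Genome
Mathematics
Gene
Gene expression
Transcriptome
Organic chemistry
Chemistry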
Authors
Daniel R. Zerbino,Ewan Birney
Source
Journal: Genome Research
[Cold Spring Harbor Laboratory Press]
Date: 2008-03-18
Volume (issue): pages: 18 (5): 821-829
Citations: 9446
Identifier
DOI: 10.1101/gr.074492.107
Abstract
We have developed a new set of algorithms, collectively called “Velvet,” to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25–50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of ∼8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.
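The abstract relies on two ideas that a short sketch may make concrete: a de Bruijn graph built from k-mers, and the N50 contig statistic. The Python below is a minimal textbook illustration, not Velvet's implementation. The function names `build_de_bruijn` and `n50`, the toy reads, and the choice of (k-1)-mers as nodes are all assumptions made for this example. It also ignores reverse complements, sequencing errors, and read pairs, all of which a real assembler has to handle.

```python
from collections import defaultdict


def build_de_bruijn(reads, k):
    """Return a dict mapping each (k-1)-mer to the set of (k-1)-mers it points to.

    Hypothetical helper for illustration: every k-mer found in a read
    contributes one edge from its (k-1)-base prefix to its (k-1)-base suffix.
    """
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


def n50(lengths):
    """Return the contig length L such that contigs of length >= L
    together cover at least half of the total assembled length."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # no contigs given


# Toy input (assumed data, not taken from the paper).
reads = ["ACGTAC", "GTACGA"]
graph = build_de_bruijn(reads, k=4)
contig_n50 = n50([2, 3, 4, 5, 10])
```

In this toy graph the node `ACG` has two successors (`CGT` and `CGA`), the kind of branch point an assembler must resolve before it can report a longer contig. Overlapping k-mers shared by both reads collapse onto the same nodes, which is why the representation stays compact at high coverage.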