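Biology
Genome
Triticeae
Hordeum vulgare
Sequence assembly
Sequence (biology)
Hybrid genome assembly
Computational biology
Whole-genome sequencing
Reference genome
k-mer
Genetics
Gene
Plant
Transcriptome
Poaceae
Gene expression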
Authors
Martin Mascher, Thomas Wicker, Jerry Jenkins, Christopher Plott, Thomas Lux, ChuShin Koh, Jennifer Ens, Heidrun Gundlach, Lori Beth Boston, Zuzana Tulpová, Samuel Holden, Inmaculada Hernández‐Pinzón, Uwe Scholz, Klaus Mayer, M. Spannagl, Curtis Pozniak, Andrew Sharpe, Hana Šimková, Matthew Moscou, Jane Grimwood
Source
Journal: The Plant Cell
[Oxford University Press]
Date: 2021-03-10
Volume (issue), pages: 33 (6): 1888-1906
Citations: 396
Identifier
DOI:10.1093/plcell/koab077
Abstract
Sequence assembly of large and repeat-rich plant genomes has been challenging, requiring substantial computational resources and often several complementary sequence assembly and genome mapping approaches. The recent development of fast and accurate long-read sequencing by circular consensus sequencing (CCS) on the PacBio platform may greatly increase the scope of plant pan-genome projects. Here, we compare current long-read sequencing platforms regarding their ability to rapidly generate contiguous sequence assemblies in pan-genome studies of barley (Hordeum vulgare). Most long-read assemblies are clearly superior to the current barley reference sequence based on short reads. Assemblies derived from accurate long reads excel in most metrics, but the CCS approach was the most cost-effective strategy for assembling tens of barley genomes. A downsampling analysis indicated that 20-fold CCS coverage can yield very good sequence assemblies, while even five-fold CCS data may capture the complete sequence of most genes. We present an updated reference genome assembly for barley with near-complete representation of the repeat-rich intergenic space. Long-read assembly can underpin the construction of accurate and complete sequences of multiple genomes of a species to build pan-genome infrastructures in Triticeae crops and their wild relatives.