The gap-free potato genome assembly reveals large tandem gene clusters of agronomical importance in highly repeated genomic regions

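Biology, Genome, Sequence assembly, Tandem, Gene, Computational biology, Genetics, Biotechnology, Evolutionary biology, Gene expression, Composite materials, Transcriptome, Materials science
Authors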
Xiaohui Yang,Lingkui Zhang,Xuewu Guo,Jianfei Xu,Kang Zhang,Yinqing Yang,Yang Yu,Yinqiao Jian,Dan Dong,Sanwen Huang,Feng Cao,Guangcun Li
Source
Journal: Molecular Plant [Elsevier]
Volume (issue): 16 (2): 314-317 Citations: 8
Identifier
DOI:10.1016/j.molp.2022.12.010
Abstract

Potato is a vital food security crop and is ranked as the world’s third most important food crop after rice and wheat. In 2011, the first genome assembly of the doubled monoploid potato DM1-3 516 R44 (DM) was released (Potato Genome Sequencing Consortium, 2011). It has been one of the most widely used reference genomes of the last decade and has served as a valuable resource for the plant genomics and potato genetics communities (Leisner et al., 2018; Yang et al., 2020; Zheng et al., 2020). The latest version of the DM genome assembly (v6.1) (Pham et al., 2020) has served as a reference and quality control in studies of diploid and tetraploid potatoes (Zhou et al., 2020; Bao et al., 2022; Hoopes et al., 2022; Sun et al., 2022; Tang et al., 2022). However, 161 gaps remain in DM6.1 (v6.1), and the centromere and telomere structures are incomplete. Considering the importance of the DM genome in potato genomics, genetics, and breeding studies, generating a complete genome assembly of DM is of great importance.
In this study, a telomere-to-telomere gap-free genome of DM (DM8.1) (Figure 1A) was assembled by combining Oxford Nanopore Technologies (ONT) ultra-long read sequencing (119.81× coverage) and Hi-C sequencing (130.57×) (Supplemental Table 1), assisted by multiple gap-closing strategies coupled with high-fidelity (HiFi) reads from circular consensus sequencing. A total of 179 contigs with a summed size of 773.36 Mb and a contig N50 of 59.72 Mb were obtained after initial genome assembly, polishing, and decontamination. Hi-C reads further anchored 37 of the 179 contigs into 12 chromosomes (Supplemental Figure 1; Supplemental Table 2), accounting for 95.53% (738.82 Mb) of the total assembly, and we named this assembly preDM8. Of the 142 unanchored contigs (34.53 Mb), over 98% are short sequences (<1 Mb), and all could be aligned to chromosomes with high similarity, indicating that they were repetitive or redundant sequences. preDM8 is more contiguous than DM6.1 and the potato pan-genome assemblies (Tang et al., 2022) (Supplemental Figure 2). However, 25 gaps remained in preDM8. Three methods were adopted to close these gaps (Supplemental Figure 3A; Supplemental Table 3). First, we aligned the ONT reads to preDM8, then collected and assembled the reads that mapped to the flanking regions of gaps, which closed 14 gaps. Second, based on the syntenic homologous fragments between preDM8 and DM6.1, three gaps were closed with the contiguous DM6.1 sequences that covered these gaps in preDM8. Third, target-sequence amplification experiments (Supplemental Figure 3B) and HiFi sequencing were performed, which closed the remaining eight gaps (Supplemental Figures 3C and 4).
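The contig N50 and the anchored fraction quoted above are simple summary statistics. As a minimal sketch (the function name `contig_n50` and the toy contig lengths are illustrative assumptions, not data from the study), the following Python shows how an N50 is conventionally computed and re-derives the 95.53% anchored fraction from the two sizes given in the text:

```python
def contig_n50(lengths):
    """Return the N50: the length L such that contigs of length >= L
    together cover at least half of the total assembly size."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the assembly is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy contig lengths in Mb (hypothetical, NOT the real DM contigs).
toy_lengths = [60, 50, 40, 20, 10, 5, 5]
print(contig_n50(toy_lengths))  # prints 50

# Anchored fraction from the figures in the text: 738.82 Mb of 773.36 Mb.
anchored_fraction = 738.82 / 773.36
print(f"{anchored_fraction:.2%}")  # prints 95.53%
```

In the toy example the total is 190 Mb, so the two longest contigs (60 + 50 = 110 Mb) already cover half of it, and the N50 is the length of the second one.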
Finally, we generated the gap-free genome assembly of DM and named it DM8.1 (Figure 1A; Supplemental Table 4). To verify the quality of the gap-free genome, we investigated the reliability of the DM8.1 sequences that correspond to the 161 gaps in DM6.1. We randomly selected 50 of the 161 gaps and designed 100 pairs of primers (Supplemental Table 5) based on sequences on both sides of these closed gaps for PCR amplification (Supplemental Figure 5) and Sanger sequencing. Both the 5′ and 3′ boundary sequences of these gaps were successfully obtained, which indicated the high accuracy of DM8.1. In addition, the DM8.1 genome achieved a BUSCO value of 98.70%; an extremely high mapping rate (>99.90%) for both Illumina short reads and ONT long reads; a high consensus quality value (35.85) from Merqury analysis; and improved long terminal repeat (LTR)-retrotransposon completeness (DM8.1: LAI = 12.92, LTR length = 388.58 Mb; DM6.1: LAI = 12.75, LTR length = 375.91 Mb), further supporting the high quality of DM8.1 (Supplemental Tables 6 and 7). A total of 40 155 protein-coding genes were predicted in DM8.1 (Supplemental Figure 6), of which 33 972 (84.60%) were functionally annotated and 24 362 were expressed, as estimated from 10 mRNA sequencing datasets. Further analysis found 1117 genes in DM8.1 that were mis-annotated in DM6.1, in that one gene was incorrectly annotated as two. These errors were revealed by individual mRNA sequencing read pairs that cover and link two mis-annotated neighboring genes, suggesting that the reads came from a transcript of a single gene (Supplemental Figure 7). A total of 956 349 transposable elements (TEs) were predicted, accounting for 60.31% (465.81 Mb) of the DM8.1 genome (Supplemental Figure 8; Supplemental Table 8). Additionally, 4676 small RNAs were predicted in DM8.1 (Supplemental Figure 9).
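To put the consensus quality value in context: Merqury reports QV on the Phred scale, so a QV can be converted to an estimated per-base error probability with QV = -10 · log10(error). The sketch below is an illustration under that definition only; the helper names are hypothetical, and it does not reproduce Merqury's k-mer-based estimation itself.

```python
import math


def qv_to_error_rate(qv):
    """Convert a Phred-scaled quality value to a per-base error probability."""
    return 10 ** (-qv / 10)


def error_rate_to_qv(error_rate):
    """Convert a per-base error probability back to a Phred-scaled QV."""
    return -10 * math.log10(error_rate)


# Consensus QV reported for DM8.1 in the text.
qv = 35.85
err = qv_to_error_rate(qv)
print(f"estimated per-base error probability: {err:.2e}")
print(f"roughly one error every {1 / err:,.0f} bases")

# Functionally annotated share of predicted genes, from the counts in the text.
print(f"{33972 / 40155:.2%}")  # prints 84.60%
```

Under this conversion, a QV of 35.85 corresponds to an estimated error on the order of one per a few thousand bases.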
All telomere regions were detected in DM8.1 using the seven-base telomeric repeat and sub-telomeric repeats of CL14 and CL34, and all centromere regions were identified using CENH3 (Figure 1A). Sequence composition analysis showed that the centromere regions contained more Gypsy-type LTRs (49.25%), while the telomere regions harbored more unknown TEs (Supplemental Figure 8). Additionally, the filled sequences in the 25 gaps showed TE contents similar to those of the centromere regions (Supplemental Figure 8). The complete genome assembly of DM8.1 facilitated the identification of large tandem gene clusters of functional importance. A total of 181 genes were identified in the newly assembled sequences corresponding to the 161 gap regions in DM6.1. Among these 181 genes, three large clusters (>15 copies) of tandem duplicated genes were found, comprising 21 patatin genes (Figure 1B), 31 terpene synthase genes, and 18 cupin genes (Supplemental Figure 10). The 21 patatin genes showed much higher expression levels in tubers than in other organs of potato (Figure 1C). Intriguingly, patatin was found to be under absolute dosage selection, because it has expanded continuously during the evolution, domestication, and breeding improvement of potato (Figures 1D–1E). Within the family Solanaceae, we found that patatin was greatly expanded only in potato and slightly expanded in wolfberry (seven copies), whereas other species kept three or fewer copies, and the gene was completely lost in Physalis and tobacco (Figure 1E). Additionally, Etuberosum, a sister group of potato, has four and five copies of patatin in the two assembled Etuberosum genomes (Figure 1D). This indicates that the expansion of patatin gene copies is associated with the speciation of potato and may play an important role in the formation of enlarged tubers. Furthermore, in the reported pan-genomes of tomato and potato (Tang et al., 2022; Zhou et al., 2022), we found that the patatin locus maintained only one or two gene copies in the tomato population but expanded continuously and significantly in the potato population, from the diploid wild potato and diploid S. candolleanum to the diploid landraces of potato, with the average copy number growing from 5.9 and 7 to 14.6, respectively (Figure 1D), clearly indicating the expansion of patatin during the domestication of potato. Moreover, these expanded patatin genes were under strong positive selection (Ka/Ks > 1), especially in the domesticated potato genomes (Supplemental Figure 11), indicating functional differentiation of patatin after gene copy expansion, which may be associated with the development, production, and quality improvement of potato tubers. Together, these findings suggest that it is possible to breed potato cultivars of higher yield and quality by manipulating the absolute dosage of patatin, i.e., its gene copy number or expression level. There have been continuous efforts to improve the reference genome of DM, which is important for both scientific research and breeding programs of potato. In this study, we have generated the gap-free telomere-to-telomere genome assembly DM8.1, which could serve as an important resource for future genomics and gene function studies in potato.
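The text does not describe how the tandem gene clusters were called, so the following is only a minimal sketch of one simple way to find them, assuming genes are given as hypothetical `(chromosome, start, family)` tuples and that a cluster is an uninterrupted run of same-family genes. The `min_copies=16` default mirrors the ">15 copies" threshold in the text; the chromosome name and coordinates in the toy data are invented.

```python
from itertools import groupby


def tandem_clusters(genes, min_copies=16):
    """Find runs of adjacent same-family genes on one chromosome.

    genes: iterable of (chromosome, start, family) tuples.
    Returns a list of (chromosome, family, copies) for every run with at
    least min_copies members. Adjacency is strict: any intervening gene
    from another family ends the run (an assumption of this sketch).
    """
    ordered = sorted(genes, key=lambda g: (g[0], g[1]))
    clusters = []
    # groupby only merges consecutive items, which is exactly a tandem run.
    for (chrom, family), run in groupby(ordered, key=lambda g: (g[0], g[2])):
        copies = len(list(run))
        if copies >= min_copies:
            clusters.append((chrom, family, copies))
    return clusters


# Toy data (hypothetical coordinates): 21 adjacent patatin-like genes,
# one unrelated gene, then 3 cupin-like genes on the same chromosome.
toy_genes = (
    [("chrA", 1000 * i, "patatin") for i in range(21)]
    + [("chrA", 30000, "other")]
    + [("chrA", 40000 + 1000 * i, "cupin") for i in range(3)]
)
print(tandem_clusters(toy_genes))  # prints [('chrA', 'patatin', 21)]
```

Real annotation pipelines often tolerate a few intervening genes or use a distance window instead of strict adjacency; that choice changes which clusters pass the copy-number threshold.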
This work was supported by the National Natural Science Foundation of China (32072119 and 31801421); the Breeding Program of Shandong Province, China (2020LZGC003); the National Agriculture Science and Technology Major Program, China (NK20220904); the China Agricultural Research System (CARS-9); the Central Public-interest Scientific Institution Basal Research Fund (Y2022PT23); and the Innovation Program of Chinese Academy of Agricultural Sciences (CAAS-ASTIP-IVFCAAS).