
DSDecode: A Web-Based Tool for Decoding of Sequencing Chromatograms for Genotyping of Targeted Mutations

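Biology, Genotyping, Genetics, Decoding methods, Computational biology, DNA sequencing, Biotechnology, DNA, Genotype, Gene, Computer science, Algorithm
Authors
Weizhi Liu, Xianrong Xie, Xingliang Ma, Jun Li, Jiehu Chen, Yao-Guang Liu
Source
Journal: Molecular Plant [Elsevier]
Volume (Issue): 8 (9): 1431-1433 Cited by: 194
Identifier
DOI: 10.1016/j.molp.2015.05.009
Abstract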

The transcription activator-like effector nucleases (TALENs) and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 genome editing systems have greatly improved the efficiency of generating targeted mutations in various organisms, including plants (Li et al., 2012; Cong et al., 2013; Li et al., 2013; Feng et al., 2014; Ma et al., 2015b; Zhang et al., 2014; Zhang et al., 2015). In some plant species, the majority of mutations induced by TALENs and CRISPR/Cas9 systems are in uniform biallelic and heterozygous status in the first transgenic generation, although in some other plant species, chimeric mutations (with three or more allelic edited events within a single individual) may frequently occur (Li et al., 2013; Feng et al., 2014; Zhang et al., 2014; Zhang et al., 2015; Ma et al., 2015b). In many cases, it is necessary to determine the mutated sequences of the targeted alleles. However, direct sequencing (with the Sanger method) of PCR amplicons containing such biallelic or heterozygous mutations results in superimposed sequencing peaks starting from the mutation sites. Therefore, cloning of the mutation-containing amplicons and sequencing of multiple clones for each target editing site are required to determine the mutated sequences of the targeted alleles, which is tedious, time consuming, and expensive. To address this problem, we have recently developed a highly reliable Degenerate Sequence Decoding (DSD) method (Ma et al., 2015a) and applied it to decode hundreds of targeted mutation events in rice and Arabidopsis (Ma et al., 2015b).
The DSD method decodes superimposed sequencing chromatograms in the following steps: (1) starting from the first overlapping-peak position on the chromatogram, manually generate a short degenerate sequence (DS) adjacent to the anchor sequence (AS), which sits upstream of the first overlapping peak; (2) query the DS against the intact reference sequence twice with a sequence analysis program to find the matched sequence(s); and (3) link the AS with the query-matched sequences to generate the allele sequences or, if only one matching hit is detected, generate the second allele sequence by subtracting the allele 1 nucleotides from the degenerate bases. Although simple and highly efficient, manual operation of the DSD method is still time consuming when a large number of superimposed sequencing chromatograms must be decoded.

Here, we describe a web-based tool, DSDecode, for automatic decoding of superimposed and regular sequencing chromatograms derived from PCR amplicons containing various types of mutations. Based on the DSD method, DSDecode carries out the whole decoding process described above automatically, including reading the corresponding sequencing chromatogram file and generating the AS and DS nucleotides from it. DSDecode is available from SCAU/SCGene (http://dsdecode.scgene.com/).

The workflow of DSDecode is illustrated in Figure 1A. The first step in initiating a new task is to paste into the text area an intact reference sequence derived from sequencing of the non-edited parental line; this sequence must contain the target site and flanking sequences and completely cover the range of the sequencing chromatogram. If the reference sequence is obtained from a public database of the species, it must be identical to that of the parental line used for genome editing. The second step is to upload a sequencing chromatogram file in .ab1 format for each decoding job.
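Steps (2) and (3) of the DSD method described above can be illustrated with a minimal Python sketch. It is not the DSDecode implementation: it assumes the DS is written with IUPAC ambiguity codes, it replaces the "sequence analysis program" with a plain exhaustive scan of the reference, and the names `find_ds_hits` and `build_alleles` as well as the toy sequences are hypothetical.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def find_ds_hits(ds: str, reference: str) -> list[int]:
    """Return every 0-based reference offset where the degenerate sequence fits."""
    ds = ds.upper()
    reference = reference.upper()
    hits = []
    for start in range(len(reference) - len(ds) + 1):
        window = reference[start:start + len(ds)]
        # A window matches when each reference base is allowed by the DS code.
        if all(base in IUPAC[code] for code, base in zip(ds, window)):
            hits.append(start)
    return hits


def build_alleles(anchor: str, ds: str, reference: str) -> list[str]:
    """Link the anchor sequence (AS) to each reference segment matched by the DS."""
    hits = find_ds_hits(ds, reference)
    return [anchor + reference[pos:pos + len(ds)] for pos in hits]


# Toy example (assumed data): a heterozygous 1-bp deletion right after the AS.
reference = "GATTACAGGCTTCAGTACCGATGCA"
anchor = reference[:8]   # "GATTACAG", read from the clean single-peak region
ds = "SYTYMRKW"          # overlapping peaks of the two alleles
print(find_ds_hits(ds, reference))  # [8, 9]
print(build_alleles(anchor, ds, reference))
```

In this toy case the first hit reproduces the wild-type sequence and the second hit is shifted by one base, which corresponds to a 1-bp deletion in the other allele. A real decoder would also have to restrict hits to the region downstream of the AS and tolerate noise, which this sketch does not attempt.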
The lengths of the AS (default 20 nt) and DS (default 15 nt) can be adjusted if necessary. The decoding process is started by clicking the “Decode” button (Figure 1B). Behind the scenes, the program carries out the following decoding steps automatically. (1) Read the sequencing digital information from the ab1 file and locate the boundary between regular single-peak signals and overlapping-peak signals. (2) Generate an AS and a DS from the sequencing information. (3) Query the DS against the reference sequence for the first hit. (4) Connect the AS and the first query-matched sequence to yield the allele 1 sequence. (5) Subtract the nucleotides of allele 1 (in the degenerate part) from the degenerate bases; then join the resulting nucleotides with the AS to yield the allele 2 sequence. (6) Align the decoded allele 1 and allele 2 sequences with the reference sequence. (7) Output the decoding results, including the AS and DS, the decoded allele 1 and allele 2 sequences, and their alignments with the reference sequence (Figure 1C). In most cases a decoding task takes about 13–15 s from data upload to display of the result. A unique ID is generated for each decoding job and can be used to review the corresponding decoding results on the “QUERY” page within a week.

A search with 13–15 bases of a DS usually provides enough uniqueness, whereas a query with a longer DS may increase the risk that high-level noise signals within the DS region interfere with the decoding. However, in some instances, one mutated allele may carry a relatively large deletion that is longer than the initially set DS length; in such cases, base mutations, if any, present in the other allele may not be detected because the mutation site is not completely covered by the shorter initial DS (Supplemental Figure 1A and 1B).
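Step (5) of the automatic decoding, the subtraction of allele 1 from the degenerate bases, can be sketched as follows. This is an illustration under assumptions rather than the program's actual code: the DS is taken to be encoded with IUPAC ambiguity codes, a single-peak position is treated as a base shared by both alleles, and `subtract_allele` is a hypothetical name.

```python
# IUPAC nucleotide codes mapped to the bases each one can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}
# Reverse lookup: a set of bases back to its IUPAC code.
BASES_TO_CODE = {frozenset(bases): code for code, bases in IUPAC.items()}


def subtract_allele(ds: str, allele1_segment: str) -> str:
    """Remove allele 1's base from each degenerate position to get allele 2."""
    if len(ds) != len(allele1_segment):
        raise ValueError("DS and allele 1 segment must have the same length")
    allele2 = []
    for code, base in zip(ds.upper(), allele1_segment.upper()):
        candidates = set(IUPAC[code])
        if base not in candidates:
            raise ValueError(f"allele 1 base {base} is not allowed by DS code {code}")
        remaining = candidates - {base}
        # A single-peak position means both alleles share the same base.
        if not remaining:
            remaining = {base}
        allele2.append(BASES_TO_CODE[frozenset(remaining)])
    return "".join(allele2)


# Toy example (assumed data): the DS of a heterozygous 1-bp deletion and the
# reference segment matched by the first query (allele 1).
anchor = "GATTACAG"
ds = "SYTYMRKW"
allele1_segment = "GCTTCAGT"
allele2_segment = subtract_allele(ds, allele1_segment)
print(anchor + allele1_segment)  # allele 1
print(anchor + allele2_segment)  # allele 2
```

Positions where more than two bases are called (for example `N`) stay degenerate after subtraction, so the sketch reports them with the corresponding ambiguity code instead of guessing a base.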
To resolve this large-deletion problem, the program is designed to recognize this type of mutation and automatically reset a longer DS (the length of the large deletion plus 5 nt; see Figure 1A), then run the decoding process again to produce a new result (Supplemental Figure 1C). If a decoding task fails, an inspection message is displayed to guide the resolution of possible causes and to link to the sub-program Semi-automatic DSDecode (see below).

Sequencing chromatograms with low or intermediate noise signals (Supplemental Figure 2A and 2B) can be effectively decoded by DSDecode. However, high-level noise signals in some low-quality sequencing chromatograms (Supplemental Figure 2C) could interfere with the identification of the AS and DS, leading to decoding failure. To decode such low-quality sequencing chromatograms, a sub-program page, Semi-automatic DSDecode (SaDSDecode), is available (Figure 1D) and can be visited directly via http://dsdecode.scgene.com/home/semidecode. By entering the AS and DS generated manually from a sequencing chromatogram as described (Ma et al., 2015a) into the text area, decoding results can be obtained with the sub-program (Figure 1E).

To obtain high-quality sequencing chromatograms for DSDecode-based automatic decoding, we recommend that targeted genomic sequences of about 400–700 bp, in which the targeted sites are located in the central region, be PCR amplified and purified with a gel purification kit. The amplicons should then be sequenced with internal (nested) specific primers, with binding positions preferably 150–300 bp (no less than 100 bp and no more than 400 bp) upstream of the targeted sites.
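The amplicon and primer recommendations above can be turned into a small pre-sequencing check. The sketch below is only a convenience illustration: `check_sequencing_design` and its parameters are hypothetical names, and it treats the stated ranges as simple numeric bounds.

```python
def check_sequencing_design(amplicon_len: int, primer_to_target: int) -> list[str]:
    """Return warnings for a design; an empty list means it follows the advice.

    amplicon_len: length of the PCR amplicon in bp.
    primer_to_target: distance in bp from the sequencing primer binding
        position to the targeted site downstream of it.
    """
    warnings = []
    if not 400 <= amplicon_len <= 700:
        warnings.append(
            f"amplicon is {amplicon_len} bp; about 400-700 bp is recommended"
        )
    if not 100 <= primer_to_target <= 400:
        warnings.append(
            f"primer binds {primer_to_target} bp upstream; keep it within 100-400 bp"
        )
    elif not 150 <= primer_to_target <= 300:
        warnings.append(
            f"primer binds {primer_to_target} bp upstream; 150-300 bp is preferred"
        )
    return warnings


# Assumed example designs.
print(check_sequencing_design(550, 200))  # []
print(check_sequencing_design(550, 120))
print(check_sequencing_design(900, 50))
```

The function only reports on the two numeric recommendations; whether the target sits in the central region of the amplicon, and whether a nested primer is used instead of a PCR primer, still has to be checked by hand.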
The primers used for the PCR amplification are not suitable for sequencing because they often produce high-level noise signals, which may interfere with the decoding by DSDecode. For multiplex targeting involving two or more closely arrayed sites, an amplicon containing these target sites can be obtained by a single PCR amplification; although fragmental deletions between two close target sites can be detected by DSDecode, independent sequencing of each target site (from both sides) is recommended.

DSDecode and SaDSDecode can genotype various types of biallelic and heterozygous mutations in diploid organisms, as shown previously (Ma et al., 2015a). In addition, simple types of biallelic and homozygous mutations with single-base insertions in both alleles, as well as homozygous base deletions, can also be analyzed with DSDecode (Supplemental Figure 3). However, chimeric mutations and the rare type, if any, of biallelic mutations with insertion or substitution of two or more bases in both alleles cannot be decoded by this program.

To verify the efficiency and accuracy of DSDecode, we tested more than 150 sequencing chromatograms derived from various types of mutations in rice and Arabidopsis with DSDecode and compared the output with the results (Ma et al., 2015b) obtained with the manual DSD method (Ma et al., 2015a). The results showed that 125 chromatograms of high and ordinary quality were successfully and correctly decoded by DSDecode. The remaining low-quality chromatograms, although they could not be properly decoded with DSDecode, could be completely decoded by SaDSDecode or by the manual DSD method. The manual DSD method is also useful for verifying the results obtained with DSDecode or SaDSDecode.

In summary, we present a versatile and user-friendly tool for genotyping various types of uniform mutations by direct sequencing of PCR amplicons containing targeted sites. Although developed for genome targeting analysis, DSDecode can also be used for genotyping nucleotide variations from other sources at single sites of sequencing chromatograms. This tool, in combination with the TALEN and CRISPR/Cas9 technologies, will greatly facilitate basic and applied biological research.

This work was supported by grants from the Ministry of Agriculture of the People's Republic of China (2014ZX08010-001, 2014ZX08009-002).
References
Cong L., Ran F.A., Cox D., Lin S., Barretto R., Habib N., Hsu P.D., Wu X., Jiang W., Marraffini L.A., et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013; 339: 819-823
Feng Z., Mao Y., Xu N., Zhang B., Wei P., Yang D.L., Wang Z., Zhang Z., Zheng R., Yang L., et al. Multigeneration analysis reveals the inheritance, specificity, and patterns of CRISPR/Cas-induced gene modifications in Arabidopsis. Proc. Natl. Acad. Sci. USA. 2014; 111: 4632-4637
Li J.F., Norville J.E., Aach J., McCormack M., Zhang D., Bush J., Church G.M., Sheen J. Multiplex and homologous recombination-mediated genome editing in Arabidopsis and Nicotiana benthamiana using guide RNA and Cas9. Nat. Biotechnol. 2013; 31: 688-691
Li T., Liu B., Spalding M.H., Weeks D.P., Yang B. High-efficiency TALEN-based gene editing produces disease-resistant rice. Nat. Biotechnol. 2012; 30: 390-392
Ma X., Chen L., Zhu Q., Liu Y. Rapid decoding of sequence-specific nuclease-induced heterozygous and biallelic mutations by direct sequencing of PCR products. Mol. Plant. 2015a; https://doi.org/10.1016/j.molp.2015.02.012
Ma X., Zhang Q., Zhu Q., Liu W., Chen Y., Qiu R., Wang B., Yang Z., Li H., Lin Y., et al. A robust CRISPR/Cas9 system for convenient, high-efficiency multiplex genome editing in monocot and dicot plants. Mol. Plant. 2015b; https://doi.org/10.1016/j.molp.2015.04.007
Zhang H., Zhang J., Wei P., Zhang B., Gou F., Feng Z., Mao Y., Yang L., Zhang H., Xu N., et al. The CRISPR/Cas9 system produces specific and homozygous targeted gene editing in rice in one generation. Plant Biotechnol. J. 2014; 12: 797-807
Zhang H., Gou F., Zhang J., Liu W., Li Q., Mao Y., Botella J.R., Zhu J.-K. TALEN-mediated targeted mutagenesis produces a large variety of heritable mutations in rice. Plant Biotechnol. J. 2015; https://doi.org/10.1111/pbi.12372