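Biology
GenBank
Protist
Ribosomal RNA
Eukaryote
Phylogenetics
Genetics
Taxonomy (biology)
Taxonomic rank
Sequence database
Computational biology
Evolutionary biology
Database
Genome
Gene
Ecology
Taxon
Computer science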
Authors
Laure Guillou, Dipankar Bachar, Stéphane Audic, David Bass, Cédric Berney, Lucie Bittner, Christophe Boutte, Gaëtan Burgaud, Colomban de Vargas, Johan Decelle, Javier del Campo, John R. Dolan, Micah Dunthorn, Bente Edvardsen, Maria Holzmann, Wiebe H. C. F. Kooistra, Enrique Lara, Noan Le Bescot, Ramiro Logares, Frédéric Mahé
Abstract
The interrogation of genetic markers in environmental meta-barcoding studies is currently seriously hindered by the lack of taxonomically curated reference data sets for the targeted genes. The Protist Ribosomal Reference database (PR2, http://ssu-rrna.org/) provides unique access to eukaryotic small subunit (SSU) ribosomal RNA and DNA sequences, with curated taxonomy. The database consists mainly of nuclear-encoded protistan sequences. However, metazoans, land plants, macrosporic fungi and eukaryotic organelles (mitochondrion, plastid and others) are also included because they are useful for the analysis of high-throughput sequencing data sets. Introns and putative chimeric sequences have also been carefully checked. The taxonomic assignment of each sequence consists of eight unique taxonomic fields. In total, 136 866 sequences are nuclear encoded and 45 708 (36 501 mitochondrial and 9657 chloroplastic) are from organelles, the remainder being putative chimeric sequences. The website allows users to download sequences from the entire database or from subsets of it (including representative sequences after clustering at a given level of similarity). Several web tools also allow searches by sequence similarity. The presence of both rRNA and rDNA sequences, the handling of introns (crucial for eukaryotic sequences), a normalized eight-rank taxonomy and updates from new GenBank releases were made possible by a long-term collaboration between experts in taxonomy and computer scientists.