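Computer science
Preprocessor
Trimming
Python (programming language)
Adapter (computing)
Java
Data preprocessing
Data mining
Source code
Programming language
Operating system
Authors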
Shifu Chen,Yanqing Zhou,Yaru Chen,Jia Gu
Source
Journal: Bioinformatics
[Oxford University Press]
Date: 2018-07-07
Volume (issue): 34 (17): i884-i890
Cited by: 17237
Identifier
DOI:10.1093/bioinformatics/bty560
Abstract
Quality control and preprocessing of FASTQ files are essential to providing clean data for downstream analysis. Traditionally, a different tool is used for each operation, such as quality control, adapter trimming and quality filtering. These tools are often insufficiently fast as most are developed using high-level programming languages (e.g. Python and Java) and provide limited multi-threading support. Reading and loading data multiple times also renders preprocessing slow and I/O inefficient.
We developed fastp as an ultra-fast FASTQ preprocessor with useful quality control and data-filtering features. It can perform quality control, adapter trimming, quality filtering, per-read quality pruning and many other operations with a single scan of the FASTQ data. This tool is developed in C++ and has multi-threading support. Based on our evaluation, fastp is 2-5 times faster than other FASTQ preprocessing tools such as Trimmomatic or Cutadapt despite performing far more operations than similar tools.
The open-source code and corresponding instructions are available at https://github.com/OpenGene/fastp.
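The single-scan idea described in the abstract can be illustrated with a small sketch. This is not fastp's implementation (fastp is written in C++ and is multi-threaded); it is a minimal, single-threaded Python illustration that assumes 4-line FASTQ records, Phred+33 quality encoding, and exact-match adapter detection. The names `Read`, `parse_fastq`, `trim_adapter`, `mean_quality` and `preprocess`, as well as the default thresholds, are hypothetical and chosen only for this example.

```python
import io
from typing import Dict, Iterator, List, NamedTuple, TextIO, Tuple


class Read(NamedTuple):
    name: str
    seq: str
    qual: str


def parse_fastq(handle: TextIO) -> Iterator[Read]:
    """Yield reads from a stream of 4-line FASTQ records (assumed layout)."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:  # end of file (or a blank line) stops the scan
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield Read(header[1:], seq, qual)


def trim_adapter(read: Read, adapter: str) -> Read:
    """Cut the read at the first exact adapter match (simplifying assumption)."""
    idx = read.seq.find(adapter)
    if idx == -1:
        return read
    return Read(read.name, read.seq[:idx], read.qual[:idx])


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean Phred score, assuming Phred+33 encoding."""
    if not qual:
        return 0.0
    return sum(ord(c) - offset for c in qual) / len(qual)


def preprocess(
    handle: TextIO,
    adapter: str,
    min_mean_q: float = 20.0,  # hypothetical threshold
    min_len: int = 5,          # hypothetical threshold
) -> Tuple[List[Read], Dict[str, int]]:
    """Collect QC statistics, trim adapters and filter reads in one pass."""
    stats = {"total": 0, "passed": 0, "bases_in": 0, "bases_out": 0}
    kept: List[Read] = []
    for read in parse_fastq(handle):
        # Quality-control counters are updated while the read is in memory,
        # so the input never has to be read a second time.
        stats["total"] += 1
        stats["bases_in"] += len(read.seq)
        read = trim_adapter(read, adapter)
        if len(read.seq) < min_len or mean_quality(read.qual) < min_mean_q:
            continue
        stats["passed"] += 1
        stats["bases_out"] += len(read.seq)
        kept.append(read)
    return kept, stats


if __name__ == "__main__":
    demo = io.StringIO("@r1\nACGTACGTAGATCGGAAG\n+\nIIIIIIIIIIIIIIIIII\n")
    reads, stats = preprocess(demo, adapter="AGATCGGAAG")
    print(reads, stats)
```

The point of the sketch is the structure of `preprocess`: statistics, trimming and filtering all happen inside one loop over the input, in contrast to chaining separate tools that each re-read the file.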