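Workflow
Scalability
Computer science
Computational biology
Deep sequencing
Software portability
Data mining
Biology
Database
Genome
Genetics
Programming language
Gene
Authors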
Gisela Gabernet, Susanna Marquez, Robert Bjornson, Alexander Peltzer, Hailong Meng, E Aron, Noah Y. Lee, Cole G. Jensen, David Ladd, M Polster, Friederike Hanssen, Simon Heumos, Gur Yaari, Markus C. Kowarik, Sven Nahnsen, Steven H. Kleinstein
Identifier
DOI:10.1371/journal.pcbi.1012265
Abstract
Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) is a valuable experimental tool to study the immune state in health and following immune challenges such as infectious diseases, (auto)immune diseases, and cancer. Several tools have been developed to reconstruct B cell and T cell receptor sequences from AIRR-seq data and infer B and T cell clonal relationships. However, currently available tools offer limited parallelization across samples, scalability, or portability to high-performance computing infrastructures. To address this need, we developed nf-core/airrflow, an end-to-end bulk and single-cell AIRR-seq processing workflow which integrates the Immcantation Framework following BCR and TCR sequencing data analysis best practices. The Immcantation Framework is a comprehensive toolset that allows the processing of bulk and single-cell AIRR-seq data from raw read processing to clonal inference. nf-core/airrflow is written in Nextflow and is part of the nf-core project, which collects community-contributed and curated Nextflow workflows for a wide variety of analysis tasks. We assessed the performance of nf-core/airrflow on simulated sequencing data with sequencing errors and show example results with real datasets. To demonstrate the applicability of nf-core/airrflow to the high-throughput processing of large AIRR-seq datasets, we validated and extended previously reported findings of convergent antibody responses to SARS-CoV-2 by analyzing 97 COVID-19-infected individuals and 99 healthy controls, including a mixture of bulk and single-cell sequencing datasets. Using this dataset, we extended the convergence findings to 20 additional subjects, highlighting the applicability of nf-core/airrflow to validate findings in small in-house cohorts with reanalysis of large publicly available AIRR datasets.