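Workflow
Computational biology
Visualization
Profiling (computer programming)
Data mining
RNA-seq
Raw data
Data science
Computer science
Transcriptome
Biology
Gene
Genetics
Gene expression
Operating system
Database
Programming language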
Identifier
DOI:10.1038/s41581-020-0262-0
Abstract
Breakthroughs in the development of high-throughput technologies for profiling transcriptomes at the single-cell level have helped biologists to understand the heterogeneity of cell populations, disease states and developmental lineages. However, these single-cell RNA sequencing (scRNA-seq) technologies generate an extraordinary amount of data, which creates analysis and interpretation challenges. Additionally, scRNA-seq datasets often contain technical sources of noise owing to incomplete RNA capture, PCR amplification biases and/or batch effects specific to the patient or sample. If not addressed, this technical noise can bias the analysis and interpretation of the data. In response to these challenges, a suite of computational tools has been developed to process, analyse and visualize scRNA-seq datasets. Although the specific steps of any given scRNA-seq analysis might differ depending on the biological questions being asked, a core workflow is used in most analyses. Typically, raw sequencing reads are processed into a gene expression matrix that is then normalized and scaled to remove technical noise. Next, cells are grouped according to similarities in their patterns of gene expression, which can be summarized in two or three dimensions for visualization on a scatterplot. These data can then be further analysed to provide an in-depth view of the cell types or developmental trajectories in the sample of interest. This Review provides the non-expert reader with an overview of the different steps involved in the analysis of single-cell RNA sequencing data. The authors also provide insight into the strengths and pitfalls of available analysis tools.
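The core workflow described in the abstract (expression matrix, normalization and scaling, grouping of similar cells, low-dimensional summary) can be illustrated with a minimal sketch in plain Python. Everything below is an assumption made for illustration and is not taken from the Review: the toy count matrix is invented, the function names are hypothetical, the scale factor of 10,000 and the choice of two groups are arbitrary, and a single principal component stands in for the two or three dimensions normally used for a scatterplot. Real analyses rely on dedicated scRNA-seq tools and far larger matrices.

```python
import math

def normalize_log(counts, scale=10_000):
    """Rescale each cell (row) to `scale` total counts, then apply log1p."""
    out = []
    for cell in counts:
        total = sum(cell)
        out.append([math.log1p(c / total * scale) if total else 0.0 for c in cell])
    return out

def scale_genes(matrix):
    """Z-score each gene (column) across cells; constant genes become 0."""
    n_cells = len(matrix)
    scaled_cols = []
    for col in zip(*matrix):
        mean = sum(col) / n_cells
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n_cells)
        scaled_cols.append([(v - mean) / sd if sd > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_cols)]

def sq_dist(a, b):
    """Squared Euclidean distance between two expression profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def two_means(points, n_iter=20):
    """k-means with k=2; deterministic start: first cell and the cell farthest from it."""
    first = list(points[0])
    farthest = list(max(points, key=lambda p: sq_dist(p, first)))
    centers = [first, farthest]
    labels = [0] * len(points)
    for _ in range(n_iter):
        labels = [0 if sq_dist(p, centers[0]) <= sq_dist(p, centers[1]) else 1
                  for p in points]
        for k in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centers[k] = [sum(v) / len(members) for v in zip(*members)]
    return labels

def first_pc_scores(matrix, n_iter=100):
    """Project cells onto the leading principal component (power iteration).

    Assumes the columns are already centered, as scale_genes() guarantees.
    """
    n_genes = len(matrix[0])
    v = [1.0 / (j + 1) for j in range(n_genes)]  # arbitrary non-zero start
    for _ in range(n_iter):
        scores = [sum(x * w for x, w in zip(row, v)) for row in matrix]      # X v
        v = [sum(s * row[j] for s, row in zip(scores, matrix))               # X^T (X v)
             for j in range(n_genes)]
        norm = math.sqrt(sum(w * w for w in v))
        if norm == 0:
            return [0.0] * len(matrix)
        v = [w / norm for w in v]
    return [sum(x * w for x, w in zip(row, v)) for row in matrix]

# Invented toy matrix: 6 cells (rows) x 4 genes (columns), unequal sequencing depth.
counts = [
    [10, 8, 0, 1],
    [20, 18, 1, 0],
    [5, 4, 0, 0],
    [0, 1, 12, 9],
    [1, 0, 25, 20],
    [0, 0, 6, 5],
]

normalized = normalize_log(counts)
scaled = scale_genes(normalized)
labels = two_means(scaled)
pc1 = first_pc_scores(scaled)

for cell_id, (lab, score) in enumerate(zip(labels, pc1)):
    print(f"cell {cell_id}: group {lab}, PC1 {score:+.2f}")
```

On this toy matrix the first three cells and the last three cells end up in separate groups despite their different total counts, which is the point of normalizing before grouping. The sign of the PC1 score is arbitrary; only the separation between groups is meaningful.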