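Biology
Fold change
RNA-Seq
False positive paradox
Gene
Computational biology
RNA
Gene expression
Gene expression profiling
Replication (statistics)
False discovery rate
Genetics
Transcriptome
Statistics
Mathematics
Virology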
Authors
Nicholas J. Schurch,Pietà Schofield,Marek Gierliński,Christian Callebaut,Alexander Sherstnev,Vijender Singh,Nicola Wrobel,Karim Gharbi,Gordon G. Simpson,Tom Owen-Hughes,Mark Blaxter,Geoffrey J. Barton
Source
Journal: RNA
Date: 2016-03-28
Volume (issue), pages: 22 (6): 839-851
Citations: 581
Identifiers
DOI:10.1261/rna.053959.115
Abstract
An RNA-seq experiment with 48 biological replicates in each of two conditions was performed to determine the number of biological replicates ($n_r$) required, and to identify the most effective statistical analysis tools for identifying differential gene expression (DGE). When $n_r=3$, seven of the nine tools evaluated give true positive rates (TPR) of only 20 to 40 percent. For high fold-change genes ($|\log_{2}(FC)| \gt 2$) the TPR is $\gt85$ percent. Two tools performed poorly, over- or under-predicting the number of differentially expressed (DE) genes. Increasing replication gives a large increase in TPR when considering all DE genes but only a small increase for high fold-change genes. Achieving a TPR $\gt85$ percent across all fold-changes requires $n_r\gt20$. For future RNA-seq experiments these results suggest $n_r\gt6$, rising to $n_r\gt12$ when identifying DGE irrespective of fold-change is important. For $6 \lt n_r \lt 12$, superior TPR makes edgeR the leading tool tested. For $n_r \ge12$, minimizing false positives is more important and DESeq outperforms the other tools.
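The abstract reports TPR both across all DE genes and for the high fold-change subset ($|\log_{2}(FC)| \gt 2$). The following is a minimal sketch of how those two quantities relate, not the authors' pipeline. The gene names, expression means, reference set, and called set are invented toy values, and the helper names `log2_fold_change` and `true_positive_rate` are hypothetical.

```python
import math


def log2_fold_change(mean_a, mean_b):
    """log2 ratio of mean expression in condition B over condition A."""
    return math.log2(mean_b / mean_a)


def true_positive_rate(called, reference):
    """Fraction of reference DE genes that a tool also called DE."""
    reference = set(reference)
    if not reference:
        raise ValueError("reference set must not be empty")
    return len(set(called) & reference) / len(reference)


# Toy data (assumption, not from the paper): mean expression per condition.
mean_expr = {
    "g1": (10.0, 80.0),
    "g2": (50.0, 60.0),
    "g3": (40.0, 8.0),
    "g4": (20.0, 30.0),
}
reference_de = {"g1", "g2", "g3", "g4"}  # hypothetical "truth" set
called_de = {"g1", "g3"}                 # hypothetical low-replicate calls

lfc = {g: log2_fold_change(a, b) for g, (a, b) in mean_expr.items()}

# High fold-change subset, using the abstract's threshold |log2(FC)| > 2.
high_fc = {g for g in reference_de if abs(lfc[g]) > 2}

tpr_all = true_positive_rate(called_de, reference_de)
tpr_high = true_positive_rate(called_de & high_fc, high_fc)

print(f"TPR, all DE genes:       {tpr_all:.2f}")
print(f"TPR, high fold-change:   {tpr_high:.2f}")
```

In this toy case the tool recovers every high fold-change gene but misses the small-effect ones, which is the pattern the abstract describes for low $n_r$.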