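Quantitative proteomics
Computer science
Missing data
Probabilistic logic
Label-free quantification
Proteomics
Statistical power
Imputation (statistics)
Statistical model
Inference
Leverage (statistics)
Data mining
Statistics
Artificial intelligence
Machine learning
Mathematics
Chemistry
Gene
Biochemistry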
Authors
Constantin Ahlmann-Eltze, Simon Anders
Source
Journal: Research Square
Date: 2020-06-23
Citations: 25
Identifier
DOI: 10.21203/rs.3.rs-36351/v1
Abstract
Protein mass spectrometry with label-free quantification (LFQ) is widely used for quantitative proteomics studies. Nevertheless, well-principled statistical inference procedures are still lacking, and most practitioners adopt methods from transcriptomics. These, however, cannot properly treat the principal complication of label-free proteomics, namely many non-randomly missing values. We present proDA, a method to perform statistical tests for differential abundance of proteins. It models missing values in an intensity-dependent probabilistic manner. proDA is based on linear models and thus suitable for complex experimental designs, and boosts statistical power for small sample sizes by using variance moderation. We show that the currently widely used methods based on ad hoc imputation schemes can report excessive false positives, and that proDA not only overcomes this serious issue but also offers high sensitivity. Thus, proDA fills a crucial gap in the toolbox of quantitative proteomics.
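The idea of intensity-dependent missingness can be illustrated with a small simulation. The sketch below is not proDA's code or API. It assumes a sigmoidal (inverse probit) dropout curve, and the function names, the parameters `midpoint` and `scale`, and all numeric values are hypothetical choices made only for illustration.

```python
import math
import random


def dropout_probability(intensity, midpoint, scale):
    """Probability that a value at this log-intensity goes unobserved.

    Assumed inverse probit curve: 0.5 at `midpoint`, approaching 1 for
    low intensities and 0 for high ones. Parameter names are illustrative.
    """
    z = (intensity - midpoint) / scale
    return 0.5 * (1.0 - math.erf(z / math.sqrt(2.0)))  # 1 - Phi(z)


def simulate_observed(true_mean, sd, midpoint, scale, n, seed=0):
    """Draw n log-intensities and drop each with intensity-dependent probability."""
    rng = random.Random(seed)
    values = [rng.gauss(true_mean, sd) for _ in range(n)]
    # Keep a value only if a uniform draw lands above its dropout probability.
    observed = [v for v in values
                if rng.random() >= dropout_probability(v, midpoint, scale)]
    return values, observed


def mean(xs):
    return sum(xs) / len(xs)


# Hypothetical protein whose true abundance sits at the dropout midpoint.
values, observed = simulate_observed(true_mean=20.0, sd=1.0,
                                     midpoint=20.0, scale=1.0, n=5000)
print(f"mean of all values:      {mean(values):.2f}")
print(f"mean of observed values: {mean(observed):.2f}")
print(f"fraction missing:        {1 - len(observed) / len(values):.2f}")
```

Because low intensities are lost preferentially, the mean of the observed values sits above the mean of all simulated values. This is the sense in which the missing values are non-random, and it suggests why treating them as absent at random, or filling them in with an ad hoc constant, can distort a comparison between conditions.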