规范化(社会学)
计算机科学
生物导体
预处理器
数据预处理
数据挖掘
人工智能
生物
人类学
生物化学
基因
社会学
作者
Lis Arend,Klaudia Adamowicz,Johannes R. Schmidt,Yuliya Burankova,Olga Zolotareva,Olga Tsoy,Josch K. Pauling,Stefan Kalkhof,Jan Baumbach,Markus List,Tanja Laske
摘要
Abstract Despite the significant progress in accuracy and reliability in mass spectrometry technology, as well as the development of strategies based on isotopic labeling or internal standards in recent decades, systematic biases originating from non-biological factors remain a significant challenge in data analysis. In addition, the wide range of available normalization methods renders the choice of a suitable normalization method challenging. We systematically evaluated 17 normalization and 2 batch effect correction methods, originally developed for preprocessing DNA microarray data but widely applied in proteomics, on 6 publicly available spike-in and 3 label-free and tandem mass tag datasets. Opposed to state-of-the-art normalization practice, we found that a reduction in intragroup variation is not directly related to the effectiveness of the normalization methods. Furthermore, our results demonstrated that the methods RobNorm and Normics, specifically developed for proteomics data, in line with LoessF performed consistently well across the spike-in datasets, while EigenMS exhibited a high false-positive rate. Finally, based on experimental data, we show that normalization substantially impacts downstream analyses, and the impact is highly dataset-specific, emphasizing the importance of use-case-specific evaluations for novel proteomics datasets. For this, we developed the PROteomics Normalization Evaluator (PRONE), a unifying R package enabling comparative evaluation of normalization methods, including their impact on downstream analyses, while offering considerable flexibility, acknowledging the lack of universally accepted standards. PRONE is available on Bioconductor with a web application accessible at https://exbio.wzw.tum.de/prone/.
科研通智能强力驱动
Strongly Powered by AbleSci AI