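Metabolomics
Calibration
Outlier
Chemistry
Euclidean distance
Data mining
Statistics
Computer science
Chromatography
Artificial intelligence
Mathematics
Authors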
Yanni Zhao,Zhiqiang Hao,Chunxia Zhao,Jieyu Zhao,Junjie Zhang,Yanli Li,Lili Li,Xin Huang,Xiaohui Lin,Zhongda Zeng,Xin Lu,Guowang Xu
Identifier
DOI:10.1021/acs.analchem.5b03912
Abstract
Metabolomics is increasingly applied to discover and validate metabolite biomarkers and to illuminate biological variation. Large-scale, long-term metabolomics studies commonly combine multiple analytical batches to generate robust data, but gross and systematic errors are often observed, so appropriate calibration is required before statistical analysis. Here, we develop a novel correction strategy for large-scale and long-term metabolomics studies that integrates data from multiple batches and different instruments by calibrating gross and systematic errors. The gross error calibration method applies various statistical and fitting models to the feature ratios between two adjacent quality control (QC) samples to screen and calibrate outlier variables. A virtual QC for each sample is produced by a linear fitting model of the feature intensities between the two neighboring QCs, yielding a correction factor that removes the systematic bias. The method was applied to metabolic profiling data of 1197 plant samples in nine batches analyzed on two gas chromatography-mass spectrometry instruments. It was evaluated by the relative standard deviations of all detected peaks, the average Pearson correlation coefficients, and the Euclidean distances of QCs and non-QC replicates. The results show that the established approach outperforms the commonly used internal standard correction and total intensity signal correction methods, can be used to integrate metabolomics data from multiple analytical batches and instruments, and allows a QC frequency of one injection per 20 real samples. The suggested method makes large-scale metabolomics analysis practicable.
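The virtual-QC step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `virtual_qc_correct` and `rsd_percent`, the choice of the QC median as the reference level, and the toy data are all assumptions made for illustration. It covers only the systematic-bias correction for a single feature, not the gross error (outlier) screening.

```python
import numpy as np


def virtual_qc_correct(order, intensity, is_qc):
    """Correct one feature using linearly interpolated "virtual QC" values.

    order     : injection order of every run (1-D, increasing)
    intensity : raw peak intensity of the feature in every run
    is_qc     : boolean mask marking the QC injections
    """
    order = np.asarray(order, dtype=float)
    intensity = np.asarray(intensity, dtype=float)
    is_qc = np.asarray(is_qc, dtype=bool)

    # Virtual QC for each run: linear interpolation between the two
    # neighboring QC injections. np.interp holds the edge value for runs
    # before the first QC or after the last QC.
    virtual_qc = np.interp(order, order[is_qc], intensity[is_qc])

    # Assumption: the reference level is the median of the measured QCs.
    reference = np.median(intensity[is_qc])

    # Correction factor per run, applied multiplicatively.
    factor = reference / virtual_qc
    return intensity * factor


def rsd_percent(values):
    """Relative standard deviation (%) using the sample standard deviation."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.std(values, ddof=1) / np.mean(values)


# Toy data (hypothetical): 7 injections, a QC at every third position,
# and a linear signal loss of 5% per injection.
order = np.arange(1, 8)
is_qc = np.array([True, False, False, True, False, False, True])
drift = 1.0 - 0.05 * (order - 1)
true_level = np.array([100, 80, 120, 100, 80, 120, 100], dtype=float)
raw = true_level * drift

corrected = virtual_qc_correct(order, raw, is_qc)

print("QC RSD before: %.2f%%" % rsd_percent(raw[is_qc]))
print("QC RSD after:  %.2f%%" % rsd_percent(corrected[is_qc]))
```

Because the simulated drift is exactly linear, the interpolated virtual QCs track it perfectly and the QC RSD collapses to zero after correction. Real drift is rarely this clean, which is one reason the RSD, Pearson correlation, and Euclidean distance checks mentioned in the abstract are useful for judging a correction.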