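Metabolomics
Benchmark (surveying)
Software
Feature selection
Computer science
Selection (genetic algorithm)
Data mining
Computational biology
Artificial intelligence
Identification (biology)
Machine learning
Chemistry
Chromatography
Biology
Botany
Programming language
Geography
Geodesy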
Authors
Zhucui Li,Yan Liu,Yiting Guo,Haijie Cao,Qinhong Wang,Shui Wang
Identifier
DOI:10.1016/j.aca.2018.05.001
Abstract
Data analysis is a key challenge in untargeted metabolomics studies, as it commonly requires extensive processing of thousands of metabolite peaks in raw high-resolution MS data. Although a number of software packages have been developed to facilitate untargeted data processing, their capabilities in feature detection, quantification, and marker selection have not been comprehensively scrutinized using a well-defined benchmark sample set. In this study, we acquired a benchmark dataset from standard mixtures consisting of 1100 compounds with specified concentration ratios, including 130 compounds with significant variation in concentration. The five software packages evaluated here (MS-Dial, MZmine 2, XCMS, MarkerView, and Compound Discoverer) showed similar performance in detecting true features derived from compounds in the mixtures. However, significant differences between the packages were observed in the relative quantification of true features in the benchmark dataset. MZmine 2 outperformed the other packages in quantification accuracy, and it reported the most true discriminating markers together with the fewest false markers. Furthermore, we assessed the selection of discriminating markers by the different packages using both the benchmark dataset and a real-case metabolomics dataset, and on that basis we propose the combined use of two packages to increase confidence in biomarker identification. Our findings from this comprehensive evaluation of untargeted metabolomics software should help guide future improvements of these widely used bioinformatics tools and enable users to properly interpret their metabolomics results.
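The abstract does not describe how marker selection and quantification were scored, so the following is only a minimal sketch of one plausible way to do it against a benchmark with known answers. It assumes the compounds with altered concentrations are available as a set of IDs and that each package reports a list of marker IDs. The function names, the compound IDs, and the log2 error metric are illustrative assumptions, not the authors' method.

```python
import math


def score_markers(reported, true_markers):
    """Count true and false discriminating markers in one package's output.

    `reported` and `true_markers` are collections of compound IDs
    (hypothetical identifiers, used only for illustration).
    """
    reported = set(reported)
    true_markers = set(true_markers)
    true_hits = reported & true_markers    # correctly flagged compounds
    false_hits = reported - true_markers   # flagged, but not truly changed
    recall = len(true_hits) / len(true_markers) if true_markers else 0.0
    return {
        "true_markers": len(true_hits),
        "false_markers": len(false_hits),
        "recall": recall,
    }


def log2_ratio_error(measured_ratio, expected_ratio):
    """Absolute gap between measured and expected fold change, in log2 units."""
    return abs(math.log2(measured_ratio) - math.log2(expected_ratio))


# Toy data, not taken from the paper.
true_markers = {"C001", "C002", "C003", "C004"}
reported = ["C001", "C002", "C999"]
result = score_markers(reported, true_markers)
print(result)
print(log2_ratio_error(4.0, 2.0))
```

Under this assumed scheme, a package that reports more true markers and fewer false markers would score higher on the first count and lower on the second, and a smaller log2 error would indicate more accurate relative quantification.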