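Chemistry
Licorice
False positive paradox
Pattern recognition (psychology)
Data set
Artificial intelligence
Convolutional neural network
Generalization
Chromatography
True positive rate
Set (abstract data type)
Analytical Chemistry (journal)
Mathematics
Computer science
Medicine
Mathematical analysis
Alternative medicine
Pathology
Programming language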
Authors
Jun Zeng, Hai Wu, Min He
Identifier
DOI:10.1016/j.aca.2022.340189
Abstract
Peak detection in untargeted liquid chromatography-high resolution mass spectrometry (LC-HRMS) data is a key step in identifying the metabolic status of druggable chemicals and of extracts from functional foods or herbs. However, existing approaches struggle to deliver results with both few false positives and few false negatives. In this paper, we propose Peak_CF, an automatic method that uses a convolutional neural network (CNN) for image classification and Faster R-CNN for peak localization and classification in untargeted LC-HRMS data. On an evaluation data set, it detects target peaks with high accuracy and high recall (both >90%). In detecting the m/z peaks of known compounds, Peak_CF outperforms Peakonly, and it can effectively judge the overall peak shape of split peaks. On the same evaluation data, the recall of MZmine2 (ADAP) is slightly higher than that of Peak_CF; however, the F1 score of Peak_CF is higher, indicating that it has higher accuracy. In addition, a Peak_CF training model with strong generalization ability was obtained and verified. Finally, Peak_CF was applied to real metabolic fingerprints of total flavonoids from Glycyrrhiza uralensis Fisch, and a comparison was conducted based on 40 m/z peaks of 40 prototypes in a serum data set. The results showed that the recall of both Peak_CF and Peakonly reached 95%, higher than the 70% of MZmine2 (ADAP), and that Peak_CF is more accurate when detecting extracted ion chromatograms (EICs) with serious drift. In conclusion, Peak_CF provides a new route for mining LC-HRMS data sets of drug (or herb, or functional food) metabolites.
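
The abstract compares tools by recall and F1 score. As a minimal sketch of how those metrics relate, the Python snippet below computes precision, recall, and F1 from peak counts. The function name `precision_recall_f1` and all counts are hypothetical and chosen only for illustration; they are not results from the paper.

```python
from typing import Tuple


def precision_recall_f1(tp: int, fp: int, fn: int) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) from true-positive, false-positive
    and false-negative peak counts. Empty denominators yield 0.0."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts (NOT taken from the paper): tool B finds slightly
# more true peaks than tool A but also reports many more false peaks.
tool_a = precision_recall_f1(tp=90, fp=8, fn=10)
tool_b = precision_recall_f1(tp=93, fp=40, fn=7)

for name, (p, r, f1) in (("tool A", tool_a), ("tool B", tool_b)):
    print(f"{name}: precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

Because F1 is the harmonic mean of precision and recall, a tool with slightly lower recall can still reach a higher F1 score when it reports far fewer false positives, which is the pattern the abstract describes for Peak_CF relative to MZmine2 (ADAP).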