PAH-Finder: A Pattern Recognition Workflow for Identification of PAHs and Their Derivatives

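Chemistry; Workflow; Mass spectrometry; Environmental chemistry; Fragmentation (computing); Mass spectrum; Random forest; Database; Chromatography; Artificial intelligence; Computer science; Operating system
Authors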
Zixuan Zhang,Xin Xu,Shipei Xing,Changzhi Shi,Zecang You,Xiaojun Deng,Ling Tan,Zhe Mo,Mingliang Fang
Source
Journal: Analytical Chemistry [American Chemical Society]
Identifier
DOI:10.1021/acs.analchem.4c04249
Abstract

Polycyclic aromatic hydrocarbons (PAHs) are pervasive environmental pollutants with significant health risks due to their carcinogenic, mutagenic, and teratogenic properties. Traditional methods for PAH identification, primarily relying on gas chromatography–mass spectrometry (GC–MS), utilize spectral library searches together with other techniques, such as mass defect analysis. However, these methods are limited by incomplete spectral libraries and a high false positive rate. Here, we present PAH-Finder, a data-driven workflow that integrates machine learning with high-resolution mass spectrometry (HRMS). PAH-Finder introduces a novel approach to evaluate the fragment distribution of PAH backbones in MS spectra by normalizing fragment m/z values to a 0–100% range relative to the molecular ion peak. Seven machine learning features capture PAH fragmentation characteristics, and a random forest model trained on 98 PAH spectra and 1003 background spectra achieved an F1 score of ∼0.9 in 5-fold cross-validation. Additionally, PAH-Finder leverages the presence of doubly charged fragments and molecular formula prediction to enhance the identification accuracy. In a case study, PAH-Finder identified 135 PAHs, including 7 types of previously unreported PAH formulas in particulate matter samples, demonstrating a 246% increase in annotation efficiency compared to the NIST20 library search. It also identified 32 heteroatom-doped PAHs not included in the training data set, showcasing its robust generalization. PAH-Finder's high accuracy in detecting a broad spectrum of PAHs facilitates efficient data processing and interpretation for nontargeted analysis, enhancing our understanding of air pollution and public health protection. PAH-Finder is freely available on GitHub (https://github.com/FangLabNTU/PAH-Finder).
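
The normalization step described in the abstract, rescaling fragment m/z values to a 0–100% range relative to the molecular ion peak, can be illustrated with a minimal Python sketch. This is not taken from the PAH-Finder code base: the function names `normalize_fragments` and `binned_intensity`, the equal-width binning, the choice to drop peaks above the molecular ion, and the example peak list are all assumptions made for illustration. The abstract does not define the seven features, so the binned summary below is only one plausible way to describe a fragment distribution.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]  # (m/z, intensity)


def normalize_fragments(peaks: Sequence[Peak], molecular_ion_mz: float) -> List[Peak]:
    """Rescale each fragment m/z to a percentage of the molecular ion m/z.

    Assumption: peaks above the molecular ion (e.g. isotope peaks) are dropped
    so that every normalized position falls within 0-100%.
    """
    if molecular_ion_mz <= 0:
        raise ValueError("molecular_ion_mz must be positive")
    return [
        (100.0 * mz / molecular_ion_mz, intensity)
        for mz, intensity in peaks
        if 0 < mz <= molecular_ion_mz
    ]


def binned_intensity(normalized: Sequence[Peak], n_bins: int = 10) -> List[float]:
    """Fraction of total intensity in each equal-width bin of the 0-100% axis.

    Hypothetical summary of the fragment distribution, not the paper's features.
    """
    totals = [0.0] * n_bins
    for pct, intensity in normalized:
        # A peak at exactly 100% belongs to the last bin.
        idx = min(int(pct / 100.0 * n_bins), n_bins - 1)
        totals[idx] += intensity
    total = sum(totals)
    return [t / total for t in totals] if total > 0 else totals


# Made-up spectrum with a molecular ion at m/z 202 (illustrative values only).
example_peaks = [(202.0, 100.0), (101.0, 20.0), (50.5, 5.0)]
normalized = normalize_fragments(example_peaks, molecular_ion_mz=202.0)
distribution = binned_intensity(normalized, n_bins=4)
print(normalized)
print(distribution)
```

Because every spectrum is mapped onto the same 0–100% axis, compounds with different molecular masses can be compared by where their fragments fall relative to the molecular ion, which is the property the abstract attributes to this normalization.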