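Computer science
Artificial intelligence
Pattern recognition (psychology)
Probabilistic logic
Reproducibility
Feature (linguistics)
Bottleneck
Synthetic data
Normalization (sociology)
Extractor
Calibration
Feature extraction
Data mining
Chromatography
Comparability
Line (geometry)
Interpretation (philosophy)
Detection limit
Preprocessor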
Authors
Daniel Walter,Mathias Helbig,Birgit Weydanz,Dominik Voltmer,Juán José Bonfiglio,C Marr,Tobias Großkopf
Source
Journal: ACS Omega
[American Chemical Society]
Date: 2026-05-27
Volume (issue): 11 (22): 32946-32954
Identifier
DOI:10.1021/acsomega.6c01862
Abstract
Reliable peak detection remains a bottleneck in size-exclusion chromatography (SEC) as overlapping signals, drifting baselines, and analyst variability limit reproducibility. As SEC is a routine release and comparability assay and its interpretation depends on peak morphology and context, machine learning methods are well-suited to improve reproducibility at scale. We present the Peak Feature Extractor 1 (PFE-1), a one-dimensional encoder-only transformer trained on millions of synthetic chromatograms generated by a simulator statistically calibrated to routine SEC data from antibodies and related large-molecule species. PFE-1 outputs probabilistic region and event predictions that are aggregated through a transparent rule-based procedure into interpretable peak boxes. We evaluate PFE-1 on synthetic benchmarks and on a curated real SEC benchmark, reporting window-level precision/recall/F1 and box-level agreement via an intensity-weighted box loss aligned with routine process annotations. Across these evaluations, PFE-1 outperforms convolutional and derivative-based baselines, with the largest gains observed under more challenging overlap and morphology conditions. On synthetic data, PFE-1 achieves substantially higher box-level agreement than both baselines; on the curated real SEC benchmark, it likewise achieves the strongest box-level agreement while requiring no sample-specific inputs (e.g., expected peak windows). We provide a reproducible and extensible SEC-specific framework for chromatographic peak detection that supports a more consistent peak interpretation in routine analytical workflows.
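The abstract describes aggregating probabilistic region predictions into peak boxes through a rule-based procedure, but it does not state the rules themselves. The following is a minimal illustrative sketch of one such rule, not the authors' method: it thresholds per-point region probabilities and keeps contiguous runs above a minimum width. The function name `probs_to_boxes` and the parameters `threshold` and `min_width` are hypothetical and chosen only for illustration.

```python
import numpy as np

def probs_to_boxes(region_prob, threshold=0.5, min_width=3):
    """Turn per-point peak-region probabilities into (start, end) index boxes.

    Hypothetical rule for illustration only: threshold the probabilities,
    then keep contiguous runs that are at least `min_width` points wide.
    """
    mask = np.asarray(region_prob) >= threshold
    # Pad with False so every run has both a rising and a falling edge.
    padded = np.concatenate(([False], mask, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1) - 1  # inclusive end index
    return [(int(s), int(e)) for s, e in zip(starts, ends) if e - s + 1 >= min_width]

# Made-up probabilities along a chromatogram trace (not real model output).
probs = np.array([0.1, 0.2, 0.8, 0.9, 0.7, 0.3, 0.6, 0.2, 0.9, 0.95, 0.9, 0.1])
boxes = probs_to_boxes(probs)
print(boxes)
```

In this toy input the single-point run at index 6 is discarded by the width rule, which shows how a transparent post-processing step can suppress isolated spurious detections. A realistic procedure would presumably also combine the event predictions mentioned in the abstract.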