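Chemometrics
Partial least squares regression
Preprocessor
Data preprocessing
Computer science
Nitric acid
Biological system
Factorial experiment
Analytical Chemistry (journal)
Mathematics
Data mining
Artificial intelligence
Chemistry
Machine learning
Chromatography
Biology
Inorganic chemistry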
Authors
Luke R. Sadergaski, T. Hager, Hunter B. Andrews
Source
Journal: ACS Omega
[American Chemical Society]
Date: 2022-02-15
Volume (issue): 7 (8): 7287-7296
Citations: 23
Identifier
DOI:10.1021/acsomega.1c07111
Abstract
Selecting optimal combinations of preprocessing methods is a major holdup for chemometric analysis. The analyst decides which method(s) to apply to the data, frequently by highly subjective or inefficient means, such as user experience or trial and error. Here, we present a user-friendly method using optimal experimental designs for selecting preprocessing transformations. We applied this strategy to optimize partial least square regression (PLSR) analysis of Stokes Raman spectra to quantify hydroxylammonium (0-0.5 M), nitric acid (0-1 M), and total nitrate (0-1.5 M) concentrations. The best PLSR model chosen by a determinant (D)-optimal design comprising 26 samples (i.e., combinations of preprocessing methods) was compared with PLSR models built with no preprocessing, a user-selected preprocessing method (i.e., trial and error), and a user-defined design strategy (576 samples). The D-optimal selection strategy improved PLSR prediction performance by more than 50% compared with the raw data and reduced the number of combinations by more than 95.5%.
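As a rough illustration of the idea in the abstract, the sketch below builds a full factorial of 576 preprocessing combinations and picks 26 of them with a simple greedy D-optimality criterion (maximizing the determinant of the information matrix X^T X). The factor names and levels are hypothetical and were chosen only so that the full factorial has 576 runs; the abstract does not list the actual factors, the model form, or the software used. The main-effects-only model and the greedy sequential search are likewise assumptions made for brevity, and dedicated design-of-experiments software typically uses exchange algorithms instead.

```python
import itertools
import numpy as np

# Hypothetical preprocessing factors and levels (NOT taken from the paper);
# 3 * 3 * 4 * 4 * 4 = 576 combinations in the full factorial.
factors = {
    "scatter_correction": ["none", "snv", "msc"],
    "derivative_order": [0, 1, 2],
    "smoothing_window": [5, 9, 13, 17],
    "normalization": ["none", "area", "max", "vector"],
    "scaling": ["none", "mean_center", "autoscale", "pareto"],
}
names = list(factors)
candidates = list(itertools.product(*factors.values()))


def encode(run):
    """Intercept plus dummy-coded main effects (first level is the reference)."""
    row = [1.0]
    for name, level in zip(names, run):
        row.extend(1.0 if level == other else 0.0 for other in factors[name][1:])
    return row


X = np.array([encode(run) for run in candidates])


def greedy_d_optimal(X, n_runs, ridge=1e-6):
    """Add one candidate at a time, each time maximizing det(X_s^T X_s).

    Uses det(M + x x^T) = det(M) * (1 + x^T M^-1 x), so the best next row
    is the one with the largest x^T M^-1 x. The small ridge term keeps M
    invertible before enough rows have been selected.
    """
    M = ridge * np.eye(X.shape[1])
    available = np.ones(len(X), dtype=bool)
    chosen = []
    for _ in range(n_runs):
        M_inv = np.linalg.inv(M)
        gain = np.einsum("ij,jk,ik->i", X, M_inv, X)
        gain[~available] = -np.inf  # do not pick the same combination twice
        best = int(np.argmax(gain))
        chosen.append(best)
        available[best] = False
        M += np.outer(X[best], X[best])
    return chosen


design_idx = greedy_d_optimal(X, n_runs=26)
design = [candidates[i] for i in design_idx]
X_design = X[design_idx]
sign, logdet = np.linalg.slogdet(X_design.T @ X_design)

print(f"{len(design)} of {len(candidates)} combinations selected")
for run in design[:5]:
    print(dict(zip(names, run)))
```

In a workflow like the one the abstract describes, each selected combination would then be applied to the spectra, a PLSR model would be fit and scored (for example by a prediction error metric), and those scores would be used to identify the best-performing preprocessing settings.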