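Interpretability
Boosting (machine learning)
Hyperparameter
Partial least squares regression
Artificial intelligence
Machine learning
Linear regression
Computer science
Gradient boosting
Regression
Mathematics
Linear model
Regression analysis
Predictive modeling
Quality (philosophy)
Pattern recognition (psychology)
Hyperparameter optimization
Type I and type II errors
Chemometrics
Nonlinear regression
Feature selection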
Authors
Fangchen Ding, Miguel Ángel Rivero-Delgado, Rili Zha, Juan Francisco García Martín
Source
Journal: Food Control
[Elsevier BV]
Date: 2025-11-26
Volume/Issue: 182: 111878-111878
Citations: 2
Identifier
DOI:10.1016/j.foodcont.2025.111878
Abstract
Near-infrared spectroscopy (NIRS) is a promising rapid, reagent-free technique for assessing the quality of fruit juices. However, most existing models focus on a single juice type and rely on linear algorithms such as partial least squares regression (PLSR), which are often inadequate for handling the nonlinear and heterogeneous characteristics of diverse juice matrices. To address this challenge, this study developed boosting models optimized by Optuna, including XGBoost, AdaBoost, and CatBoost, to predict four key quality traits, namely acidity, total phenolic compounds (TPC), total flavonoid content (TFC), and vitamin C, across four types of fruit juice. The boosting models consistently outperformed PLSR, particularly for acidity, TPC, and vitamin C, achieving Rp² values above 0.95 and RPD values exceeding 4.93. SHAP-based interpretability analysis further revealed that, in addition to typical NIRS absorption bands such as 1163 nm, 1169 nm, and 1193 nm located within the 1150 to 1210 nm region, non-classical regions including 1104 nm and several wavelengths between 1264 and 1322 nm also contributed positively to the model outputs. This demonstrates the capacity of boosting algorithms to capture informative spectral features from non-classical regions that are often overlooked by traditional linear models. Overall, this study demonstrates the value of combining automated hyperparameter optimization with interpretable machine learning, offering a robust and scalable framework for high-throughput, non-invasive quality control in the juice industry by NIRS.
• NIR spectral data from different types of fruit juice were analyzed.
• Non-linear boosting models were developed to predict quality traits of juice.
• Optuna was used for efficient hyperparameter optimization of the boosting models.
• SHAP analysis revealed the contribution of specific spectral bands to model output.
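The abstract reports model quality as Rp² and RPD without stating the formulas. The sketch below shows how these two figures are commonly computed in chemometrics, assuming Rp² is the coefficient of determination on the prediction set and RPD is the sample standard deviation of the reference values divided by the root-mean-square error of prediction (RMSEP). These definitions, the function names, and the sample values are assumptions for illustration; the paper may use a slightly different variant (for example, SEP instead of RMSEP).

```python
import math
import statistics


def r2_prediction(y_true, y_pred):
    """Coefficient of determination on a prediction set: 1 - SS_res / SS_tot."""
    mean_y = statistics.fmean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmsep(y_true, y_pred):
    """Root-mean-square error of prediction."""
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(ss_res / len(y_true))


def rpd(y_true, y_pred):
    """Ratio of performance to deviation (assumed here as sample SD / RMSEP)."""
    return statistics.stdev(y_true) / rmsep(y_true, y_pred)


# Made-up reference and predicted values, not data from the study.
y_ref = [1.0, 2.0, 3.0, 4.0, 5.0]
y_hat = [1.1, 1.9, 3.2, 3.8, 5.0]

print(f"Rp2 = {r2_prediction(y_ref, y_hat):.3f}")
print(f"RPD = {rpd(y_ref, y_hat):.2f}")
```

Under these definitions the two metrics are closely linked: for a large prediction set, RPD is approximately 1 / sqrt(1 - Rp²), so higher Rp² goes together with higher RPD.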