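Subspace topology
Ginseng
Linear discriminant analysis
Training set
Classifier (UML)
Artificial intelligence
Random subspace method
Pattern recognition (psychology)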
Test set
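Mathematics
Near-infrared spectroscopy
Ensemble learning
Computer science
Machine learning
Statistics
Biology
Medicine
Pathology
Neuroscience
Alternative medicine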
Authors
Hui Chen, Chao Tan, Zan Lin
Identifier
DOI:10.1016/j.saa.2023.123315
Abstract
Ginseng is a well-known traditional herbal medicine, but ginseng sold on the market may not have been produced in the region claimed. Traditional methods of identifying the geographical origin of ginseng are subjective, time-consuming, or destructive, so a more efficient approach is desirable. This study explored the feasibility of combining near-infrared (NIR) spectroscopy with ensemble learning to discriminate the producing area of ginseng. A total of 270 samples were collected and evenly partitioned into training and test sets. A random subspace ensemble (RSE) that uses a linear discriminant analysis (LDA) classifier as the weak learner (abbreviated RSE-LDA) was used to construct the predictive models. Two parameters, the subspace size and the number of learners in the ensemble, were optimized. The classic partial least squares (PLS) algorithm was applied to build the reference model. The final RSE-LDA model achieved a sensitivity, specificity, and total accuracy of 97.8 %, 100 %, and 99.3 %, respectively, compared with 93.3 %, 96.7 %, and 95.6 % for the PLS model. To study the impact of training set composition on the results, the samples were randomly re-partitioned 200 times and the algorithm was rerun each time so that the sensitivity and specificity on the test set could be analyzed statistically; similar results were obtained. The effect of training set size was also investigated. The results indicate that combining NIR spectroscopy with the RSE algorithm is a promising tool for discriminating the origin of ginseng.
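The random subspace idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the data are synthetic Gaussian features standing in for NIR spectra, the helper names (`fit_lda`, `fit_rse_lda`, `predict_rse_lda`, `evaluate`) are hypothetical, and the subspace size (10) and number of learners (25) are placeholder values, since the abstract says these were optimized but does not report them. Each weak learner is a two-class LDA fitted on a random subset of features, and the ensemble predicts by majority vote.

```python
import numpy as np

def fit_lda(X, y):
    """Two-class LDA; returns (w, b) such that X @ w + b > 0 predicts class 1."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # Pooled within-class covariance; the small ridge term is an assumption
    # added here for numerical stability.
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    Sw = Sw / (len(X) - 2) + 1e-6 * np.eye(X.shape[1])
    w = np.linalg.solve(Sw, mu1 - mu0)
    b = -0.5 * w @ (mu0 + mu1) + np.log(len(X1) / len(X0))
    return w, b

def fit_rse_lda(X, y, n_learners=25, subspace_size=10, seed=0):
    """Fit one LDA per randomly drawn feature subset (random subspace method)."""
    rng = np.random.default_rng(seed)
    ensemble = []
    for _ in range(n_learners):
        idx = rng.choice(X.shape[1], size=subspace_size, replace=False)
        w, b = fit_lda(X[:, idx], y)
        ensemble.append((idx, w, b))
    return ensemble

def predict_rse_lda(ensemble, X):
    """Majority vote over the weak learners (odd n_learners avoids ties)."""
    votes = np.array([X[:, idx] @ w + b > 0 for idx, w, b in ensemble])
    return (votes.mean(axis=0) > 0.5).astype(int)

def evaluate(y_true, y_pred):
    """Sensitivity, specificity, and total accuracy for binary labels."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(y_true)

# Synthetic stand-in data: 270 samples, 100 features, two classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1], 135)
X = rng.normal(size=(270, 100))
X[y == 1] += 1.0  # artificial class shift, not a property of real spectra

# Even split into training and test sets, mirroring the 270-sample design.
order = rng.permutation(270)
train, test = order[:135], order[135:]

ensemble = fit_rse_lda(X[train], y[train], n_learners=25, subspace_size=10)
y_pred = predict_rse_lda(ensemble, X[test])
sens, spec, acc = evaluate(y[test], y_pred)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

Repeating the permutation step with different seeds and collecting `sens` and `spec` each time would mimic the repeated random partitioning mentioned in the abstract. Numbers produced on this synthetic data say nothing about the reported results.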