1. School of Science, Changchun University of Science and Technology, Changchun 130022, China
2. School of Physics and Electronics, Shandong Normal University, Jinan 250358, China
Fund Project: Project supported by the National Natural Science Foundation of China (Grant No. 61575030), the Natural Science Foundation of Jilin Province, China (Grant Nos. 20180101283JC, 20200301042RQ, 20180201033GX, 20190302125GGX), and the Research Foundation of Education Bureau of Jilin Province, China (Grant No. JJKH20190539KJ).
Received Date: 11 September 2020
Accepted Date: 19 October 2020
Available Online: 06 February 2021
Published Online: 20 February 2021
Abstract: Based on laser-induced breakdown spectroscopy (LIBS) and machine learning algorithms, a ginseng origin identification model is established by combining principal component analysis (PCA) with a back-propagation (BP) neural network and with a support vector machine (SVM), and is used to identify ginseng from five origins in northeast China (Daxinganling, Ji'an, Hengren, Shizhu, and Fusong). A total of 657 LIBS spectra in the range 200–975 nm are collected from ginseng of the five origins. The continuous background of the raw spectra is reduced by a moving-window smoothing method, and the spectral lines are assigned to elements according to the NIST Atomic Spectra Database. Eight characteristic spectral lines of seven elements (Mg, Ca, Fe, C, H, N, and O) are selected for PCA according to the characteristic-line selection conditions. The cumulative contribution rate of the first three principal components reaches 92.50%, so they retain most of the information in the original ginseng LIBS spectra, and the samples show good aggregation and separation in the principal component space. After dimension reduction, the first-three-principal-component scores of the 657 spectra are randomly divided in a ratio of 2 to 1 into a training set of 438 spectra and a test set of 219 spectra, which serve as the inputs of the classification algorithms. The experimental results show that PCA combined with the BP neural network and PCA combined with the SVM correctly identify 217 and 218 of the 219 test spectra, respectively, corresponding to average recognition rates of 99.08% and 99.5%. The modeling time of the BP neural network is 11.545 s shorter than that of the SVM. Both models misjudge Ji'an ginseng as Shizhu ginseng. The reason for this misjudgment is that the H and O line intensities normalized to the Ca ion emission line are similar for the two origins, owing to the geographical proximity of Ji'an and Shizhu. The study demonstrates that LIBS combined with machine learning algorithms is a useful technique for rapid identification of ginseng origin and is expected to enable automatic, real-time, rapid, and reliable discrimination.
Keywords: laser-induced breakdown spectroscopy / machine learning algorithm / identification of origin / ginseng
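The abstract mentions reducing the continuous background with a moving-window smoothing method but does not give the details. The following is only a minimal sketch of one common variant, assuming the baseline is estimated as a moving average of a moving minimum and then subtracted; the function names and the window half-width are hypothetical and are not taken from the paper.

```python
def moving_min(values, half_width):
    """Minimum over each point's neighbourhood (window clipped at the edges)."""
    n = len(values)
    return [min(values[max(0, i - half_width):min(n, i + half_width + 1)])
            for i in range(n)]


def moving_mean(values, half_width):
    """Average over each point's neighbourhood (window clipped at the edges)."""
    n = len(values)
    out = []
    for i in range(n):
        window = values[max(0, i - half_width):min(n, i + half_width + 1)]
        out.append(sum(window) / len(window))
    return out


def subtract_background(intensities, half_width=2):
    """Assumed scheme: baseline = smoothed moving minimum; negatives clipped to 0."""
    baseline = moving_mean(moving_min(intensities, half_width), half_width)
    return [max(0.0, y - b) for y, b in zip(intensities, baseline)]


# Toy spectrum: flat continuum of 10 counts with one emission line of 50 counts.
spectrum = [10.0, 10.0, 10.0, 50.0, 10.0, 10.0, 10.0]
corrected = subtract_background(spectrum, half_width=1)
```

In this toy case the flat continuum is removed and only the line remains above zero. For real spectra the window would need to be wider than the broadest emission line.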
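PCA was used to analyze how the eight characteristic lines of the seven elements Mg, Ca, Fe, C, H, N, and O in the ginseng LIBS spectra contribute to the principal components of the full LIBS spectrum. The contribution rates of the first 10 principal components and their cumulative contribution rate are shown in Fig. 4(a). The cumulative contribution rate of PC1, PC2, and PC3 is 92.5%, so these three components can be regarded as containing most of the information in the original ginseng LIBS spectra. The three-dimensional scatter plot formed by the PC1, PC2, and PC3 vectors is shown in Fig. 4(b), where each point represents one ginseng sample. After PCA, the characteristic LIBS spectra of ginseng from the same origin occupy a specific aggregation region, showing good clustering. The results indicate that PCA-processed LIBS data can characterize the origin-specific information of ginseng and effectively distinguish ginseng of different origins. As Fig. 4(b) shows, ginseng from Hengren (HR), Fusong (FS), and Daxinganling (DXAL) clusters well and is clearly separated from one another; the Ji'an (JA) and Shizhu (SZ) samples also form clusters, but these partially overlap.

Figure 4. (a) Contribution rate of each principal component and cumulative contribution rate of the principal components; (b) three-dimensional scatter plot of the first three principal components.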
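The contribution rate and cumulative contribution rate described above can be illustrated with a short NumPy sketch. It assumes the input is a matrix of 657 spectra by 8 line intensities and that each line is standardized before PCA; the paper does not state its scaling, and the random matrix below is a placeholder, not the measured data.

```python
import numpy as np


def pca_contributions(X, n_components=3):
    """Return per-component contribution rates, their cumulative sum, and scores."""
    X = np.asarray(X, dtype=float)
    centered = X - X.mean(axis=0)
    std = centered.std(axis=0, ddof=1)
    std[std == 0] = 1.0              # guard against constant columns
    Z = centered / std               # assumed: standardized line intensities
    # SVD of the standardized data; squared singular values are proportional
    # to the variance carried by each principal component.
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = s**2 / np.sum(s**2)  # contribution rate of each component
    cumulative = np.cumsum(explained)
    scores = Z @ Vt[:n_components].T # sample coordinates on PC1..PCn
    return explained, cumulative, scores


# Placeholder data with the same shape as the study: 657 spectra x 8 lines.
rng = np.random.default_rng(0)
X = rng.normal(size=(657, 8))
explained, cumulative, scores = pca_contributions(X)
print("cumulative contribution of PC1-PC3:", cumulative[2])
```

With real line intensities, `cumulative[2]` is the quantity reported as 92.5%, and the three columns of `scores` are the coordinates plotted in a scatter plot such as Fig. 4(b).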
Table 2. Comparison of ginseng origin identification results.
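The split sizes and recognition rates quoted in the abstract follow from simple arithmetic, sketched below. The helper names and the random seed are hypothetical; only the counts (657 spectra, 219 test spectra, 217 and 218 correct) come from the text.

```python
import random


def split_two_to_one(n_samples, seed=0):
    """Randomly split sample indices into training and test sets at a 2:1 ratio."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = (2 * n_samples) // 3
    return indices[:n_train], indices[n_train:]


def recognition_rate(n_correct, n_total):
    """Percentage of test spectra assigned to the correct origin."""
    return 100.0 * n_correct / n_total


train_idx, test_idx = split_two_to_one(657)
bp_rate = recognition_rate(217, 219)   # PCA + BP neural network
svm_rate = recognition_rate(218, 219)  # PCA + support vector machine
print(len(train_idx), len(test_idx), round(bp_rate, 2), round(svm_rate, 2))
```

The split gives 438 training and 219 test spectra. The ratios 217/219 and 218/219 evaluate to about 99.09% and 99.54%, which agree with the reported 99.08% and 99.5% up to rounding.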
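The quality of ginseng is determined mainly by its content of ginsenosides and ginseng polysaccharides. Ginsenosides are sterol compounds, and the saponins and polysaccharides in ginseng consist mainly of elements such as C, H, and O. The intensities of the C I 247.8 nm, H I 656.39 nm, and O I 777.42 nm lines of ginseng from the five origins, normalized to the intensity of the Ca II 394.2 nm line, are shown in Fig. 7. Although the atomic emission line intensities of the metallic elements differ between JA and SZ ginseng because of their different origins, the normalized intensities of the H I 656.39 nm and O I 777.42 nm lines are almost identical for the two, which leads to misjudgment when JA and SZ ginseng are classified by origin.

Figure 7. Normalized intensity ratios of C, H and O element lines in the LIBS spectrum.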
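The normalization behind Fig. 7 amounts to dividing each line intensity by the intensity of the Ca II reference line. A minimal sketch follows; the function name is hypothetical and the intensity values are made up for illustration, not measured data.

```python
def normalize_to_reference(line_intensities, reference="Ca II 394.2"):
    """Divide every line intensity by the reference-line intensity."""
    ref = line_intensities[reference]
    if ref <= 0:
        raise ValueError("reference line intensity must be positive")
    return {name: value / ref
            for name, value in line_intensities.items()
            if name != reference}


# Made-up intensities (arbitrary units) for a single sample.
sample = {
    "C I 247.8": 1200.0,
    "H I 656.39": 800.0,
    "O I 777.42": 600.0,
    "Ca II 394.2": 2000.0,
}
ratios = normalize_to_reference(sample)
```

Comparing such ratios between origins is what reveals the near-identical H and O values for JA and SZ ginseng described above.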