1. School of Science, Changchun University of Science and Technology, Changchun 130022, China; 2. School of Physics and Electronics, Shandong Normal University, Jinan 250358, China
Fund Project: Project supported by the National Natural Science Foundation of China (Grant No. 61575030), the Natural Science Foundation of Jilin Province, China (Grant Nos. 20180101283JC, 20200301042RQ, 20180201033GX, 20190302125GGX), and the Research Foundation of Education Bureau of Jilin Province, China (Grant No. JJKH20190539KJ).
Received Date: 11 September 2020
Accepted Date: 19 October 2020
Available Online: 06 February 2021
Published Online: 20 February 2021
Abstract: Based on laser-induced breakdown spectroscopy (LIBS) and machine learning algorithms, a ginseng origin identification model is established by combining principal component analysis (PCA) with a back-propagation (BP) neural network and with a support vector machine (SVM). The model is used to identify ginseng from five origins in northeast China (Daxinganling, Ji'an, Hengren, Shizhu, and Fusong). A total of 657 groups of LIBS spectra in the range 200–975 nm are collected from ginseng of the five origins. The continuous background of the raw spectra is reduced by a moving-window smoothing method, and the spectral lines are assigned to elements according to the NIST atomic spectra database. Eight characteristic spectral lines of seven elements (Mg, Ca, Fe, C, H, N, and O) are selected for PCA according to the characteristic-line selection conditions. The cumulative contribution rate of the first three principal components reaches 92.50%, so they retain most of the information in the original ginseng LIBS spectra, and the samples cluster well by origin in the principal component space. After dimension reduction, the first three principal components of the 657 spectra are randomly divided in a ratio of 2:1 into a training set of 438 spectra and a test set of 219 spectra, which serve as the inputs of the classification algorithms. The experimental results show that PCA combined with the BP neural network and with the SVM correctly identifies 217 and 218 of the 219 test spectra, respectively, with average recognition rates of 99.08% and 99.5%. The modeling time of the BP neural network is 11.545 s shorter than that of the SVM. Both models misclassify Ji'an ginseng as Shizhu ginseng. The reason is that the intensities of the H and O lines, normalized to the Ca ion emission line, are similar for the two origins, because Ji'an and Shizhu are close to each other in geographical environment. The study demonstrates that LIBS combined with machine learning algorithms is a useful technique for rapid identification of ginseng origin and is expected to enable automatic, real-time, rapid, and reliable discrimination.
Keywords: laser-induced breakdown spectroscopy / machine learning algorithm / identification of origin / ginseng
2.1. Experimental setup
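The experimental setup used for ginseng origin identification by laser-induced breakdown spectroscopy is shown in Fig. 1. The laser source is an Nd:YAG laser (Continuum, surellite II) with an output wavelength of 1064 nm, a pulse width of 10 ns, a repetition rate of 10 Hz, and a beam diameter of 6 mm. The pulse energy used to induce breakdown of the ginseng is controlled by an energy adjustment system consisting of a half-wave plate and a Glan prism. The beam is focused by a fused-silica plano-convex lens with a focal length of 120 mm onto the surface of the ginseng sample, where it induces breakdown and produces a plasma. The focal point lies 0.8 mm below the sample surface in order to avoid breakdown of the air, which would interfere with the analysis of the ginseng spectrum. In the direction at 45° to the expansion axis of the ginseng plasma, a fused-silica lens with a focal length of 75 mm collects the plasma emission and couples it into the fiber probe of an echelle spectrometer (Andor, Me5000) equipped with an ICCD detector (1024 × 1024 pixel, DH334). The spectrometer has a focal length of 195 mm, a resolving power of $\lambda/\Delta\lambda \approx 5000$, and a spectral range of 200–975 nm in a single acquisition. The laser and the ICCD detector are triggered synchronously by a digital pulse delay generator (Standoff, DG645). The delay between the laser pulse and the ICCD detector and the gate width of the ICCD detector were optimized and set to 1 and 5 μs, respectively, to obtain ginseng LIBS spectra with a high signal-to-background ratio. To avoid excessive ablation, the ginseng sample is fixed on a three-dimensional translation stage so that each laser pulse strikes a fresh position on the sample surface. Each ginseng LIBS spectrum is the average of 100 pulses, which reduces the influence of pulse-energy jitter on the stability of the spectrum. All experiments were carried out at standard atmospheric pressure, a room temperature of 22 ℃, and a relative air humidity of 25%.

Figure 1. Schematic diagram of the experimental setup of LIBS.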
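The effect of averaging 100 pulses can be illustrated with synthetic numbers. The sketch below is not the authors' acquisition code: it assumes, purely for illustration, that shot-to-shot pulse-energy jitter acts as independent Gaussian noise with a 10% relative standard deviation on a single line intensity. Under that assumption, averaging 100 shots should reduce the relative standard deviation by roughly a factor of √100 = 10. The names `simulate_shot` and `averaged_intensity`, the jitter level, and the intensity value are all hypothetical.

```python
import random
import statistics

def simulate_shot(true_intensity, jitter, rng):
    """One synthetic single-shot line intensity.

    Assumption: pulse-energy jitter is Gaussian and multiplicative.
    """
    return true_intensity * (1.0 + rng.gauss(0.0, jitter))

def averaged_intensity(true_intensity, jitter, n_shots, rng):
    """Mean line intensity over n_shots synthetic laser pulses."""
    shots = [simulate_shot(true_intensity, jitter, rng) for _ in range(n_shots)]
    return statistics.fmean(shots)

rng = random.Random(0)  # fixed seed so the illustration is reproducible

# 2000 single-shot readings versus 2000 readings that each average 100 shots.
single = [simulate_shot(1000.0, 0.10, rng) for _ in range(2000)]
averaged = [averaged_intensity(1000.0, 0.10, 100, rng) for _ in range(2000)]

# Relative standard deviation (RSD) as a simple stability measure.
rsd_single = statistics.stdev(single) / statistics.fmean(single)
rsd_averaged = statistics.stdev(averaged) / statistics.fmean(averaged)

print(f"RSD single shot:       {rsd_single:.4f}")
print(f"RSD 100-shot average:  {rsd_averaged:.4f}")
```

Real shot-to-shot fluctuations also include sample inhomogeneity and plasma instabilities, so the actual gain from averaging may differ from this idealized estimate.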
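PCA was used to analyze how the 8 characteristic lines of the 7 elements Mg, Ca, Fe, C, H, N, and O in the ginseng LIBS spectra contribute to the principal components of the full LIBS spectrum. The contribution rates of the first 10 principal components and their cumulative contribution rates are shown in Fig. 4(a). The cumulative contribution rate of PC1, PC2, and PC3 is 92.5%, so these three components can be regarded as containing most of the information in the original ginseng LIBS spectra. The three-dimensional scatter plot formed by PC1, PC2, and PC3 is shown in Fig. 4(b), in which each point represents one ginseng sample. After PCA, the characteristic LIBS spectra of ginseng samples from the same origin occupy a specific region, showing good clustering. The results indicate that the PCA-processed LIBS data characterize the origin-specific information of ginseng and effectively distinguish ginseng from different origins. As Fig. 4(b) shows, ginseng from Hengren (HR), Fusong (FS), and Daxinganling (DXAL) clusters well and is clearly separated from the other origins, whereas the samples from Ji'an (JA) and Shizhu (SZ) also form clusters but partially overlap.

Figure 4. (a) Contribution rate of each principal component and cumulative contribution rate of principal components; (b) three-dimensional scatter plot of the first three principal components.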
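The contribution rate of a principal component is its eigenvalue divided by the total variance, and the cumulative contribution rate is the running sum of these ratios. The following minimal sketch shows that calculation on synthetic data with 8 features. It is not the authors' implementation: it assumes covariance-based PCA on mean-centered data (the text does not say whether the line intensities were standardized first), and it uses a plain power iteration with deflation so that it needs only the standard library. The helper names `covariance` and `leading_eigenvalues` and the synthetic data are hypothetical.

```python
import random

def covariance(rows):
    """Sample covariance matrix of the columns of `rows` (mean-centered)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_eigenvalues(matrix, k, iterations=500):
    """Top-k eigenvalues of a symmetric PSD matrix (power iteration + deflation)."""
    d = len(matrix)
    a = [row[:] for row in matrix]  # work on a copy, the input is not modified
    eigenvalues = []
    for _component in range(k):
        v = [1.0 / (i + 1) for i in range(d)]  # fixed, non-degenerate start vector
        for _step in range(iterations):
            w = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        av = [sum(a[i][j] * v[j] for j in range(d)) for i in range(d)]
        vv = sum(x * x for x in v)
        lam = sum(v[i] * av[i] for i in range(d)) / vv  # Rayleigh quotient
        eigenvalues.append(lam)
        for i in range(d):  # deflation: remove the component just found
            for j in range(d):
                a[i][j] -= lam * v[i] * v[j] / vv
    return eigenvalues

# Synthetic stand-in for 8 line intensities per spectrum: three hidden factors
# plus small noise (an assumption made only so the example has structure).
rng = random.Random(1)
samples = []
for _ in range(300):
    f1, f2, f3 = rng.gauss(0, 5), rng.gauss(0, 3), rng.gauss(0, 2)
    clean = [f1, f1, f1, f2, f2, f2, f3, f3]
    samples.append([x + rng.gauss(0, 0.3) for x in clean])

cov = covariance(samples)
total_variance = sum(cov[i][i] for i in range(len(cov)))  # trace = sum of eigenvalues
rates = [lam / total_variance for lam in leading_eigenvalues(cov, 3)]
cumulative = sum(rates)

print("contribution rates of PC1-PC3:", [round(r, 4) for r in rates])
print("cumulative contribution rate:", round(cumulative, 4))
```

In practice a library routine (for example an SVD-based PCA) would replace the hand-written eigenvalue solver. The power iteration is used here only to keep the sketch free of dependencies.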
Table 2. Comparison of ginseng origin identification results.
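The data split and recognition rates quoted in the abstract can be checked with a few lines of arithmetic. The sketch below assumes a plain random 2:1 split of the 657 spectra (whether the authors stratified the split by origin is not stated) and takes the recognition rate as the fraction of correctly identified test spectra. The seed and the helper name `recognition_rate` are hypothetical.

```python
import random

N_SPECTRA = 657  # total number of LIBS spectra reported in the abstract

# Random 2:1 split of spectrum indices (assumption: not stratified by origin).
indices = list(range(N_SPECTRA))
rng = random.Random(42)  # arbitrary seed, for reproducibility only
rng.shuffle(indices)
n_train = N_SPECTRA * 2 // 3
train_idx, test_idx = indices[:n_train], indices[n_train:]

def recognition_rate(n_correct, n_total):
    """Percentage of test spectra assigned to the correct origin."""
    return 100.0 * n_correct / n_total

# Correct identifications reported for the 219 test spectra.
bp_rate = recognition_rate(217, len(test_idx))   # PCA + BP neural network
svm_rate = recognition_rate(218, len(test_idx))  # PCA + SVM

print(len(train_idx), len(test_idx))
print(f"BP: {bp_rate:.4f}%  SVM: {svm_rate:.4f}%")
```

The resulting values, about 99.087% and 99.543%, agree with the reported 99.08% and 99.5% to within rounding or truncation.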
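The quality of ginseng is determined mainly by its content of ginsenosides and ginseng polysaccharides. Ginsenosides are steroid-type compounds, and the saponins and polysaccharides in ginseng are determined mainly by elements such as C, H, and O. The intensities of the C I 247.8 nm, H I 656.39 nm, and O I 777.42 nm lines of ginseng from the five origins, normalized to the intensity of the Ca II 394.2 nm line, are shown in Fig. 7. Although the atomic emission line intensities of the metallic elements differ between JA and SZ ginseng because of their different origins, the normalized intensities of the H I 656.39 nm and O I 777.42 nm lines are almost identical for the two, which leads to misclassification between the JA and SZ origins.

Figure 7. Normalized intensity ratios of C, H and O element lines in the LIBS spectrum.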
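The normalization described above amounts to dividing each analyte line intensity by the Ca II 394.2 nm line intensity of the same spectrum. The sketch below shows one way to do this. It assumes, for illustration only, that a line intensity is taken as the maximum within ±0.2 nm of the nominal wavelength (the text does not specify the extraction method), and it uses a made-up toy spectrum. The helper names `peak_intensity` and `normalized_ratios` are hypothetical.

```python
REFERENCE_LINE_NM = 394.2  # Ca II line used for normalization in the text
ANALYTE_LINES_NM = {"C I": 247.8, "H I": 656.39, "O I": 777.42}

def peak_intensity(wavelengths, intensities, center_nm, half_window_nm=0.2):
    """Maximum intensity within +/- half_window_nm of center_nm.

    Assumption: peak height in a small window stands in for the line intensity.
    """
    window = [i for w, i in zip(wavelengths, intensities)
              if abs(w - center_nm) <= half_window_nm]
    if not window:
        raise ValueError(f"no data points near {center_nm} nm")
    return max(window)

def normalized_ratios(wavelengths, intensities):
    """Analyte line intensities divided by the Ca II reference line intensity."""
    reference = peak_intensity(wavelengths, intensities, REFERENCE_LINE_NM)
    return {name: peak_intensity(wavelengths, intensities, nm) / reference
            for name, nm in ANALYTE_LINES_NM.items()}

# Toy spectrum (made-up values, three points around each line of interest).
wavelengths = [247.7, 247.8, 247.9, 394.1, 394.2, 394.3,
               656.3, 656.39, 656.5, 777.3, 777.42, 777.5]
intensities = [10, 120, 12, 30, 400, 28,
               15, 200, 14, 9, 100, 11]

ratios = normalized_ratios(wavelengths, intensities)
print(ratios)  # C I: 0.3, H I: 0.5, O I: 0.25 for this toy spectrum
```

Comparing such ratios between origins makes the reported confusion concrete: if two origins give nearly equal H I and O I ratios, those features carry little information for separating them.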