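Abstract: This paper attempts to use the random forest algorithm, a machine learning method, to analyze the relationships between major and trace elements in rocks. Oceanic basalts are almost unaffected by contamination, so the relationships among their elemental compositions are relatively stable, and the trace element Zr is a stable high field strength element. We therefore use the random forest algorithm to explore the relationship between Zr and the major elements in ocean island basalts. The algorithm's variable importance measures are used to judge how strongly each major element is related to Zr; combined with Pearson correlation analysis, five major elements (TiO2, CaO, MgO, Na2O and P2O5) are selected as predictors. From these five elements, a random forest model of 1,000 decision trees with 3 features per tree is determined, and this model predicts Zr better than ordinary multivariable regression. An empirical formula relating Zr to these five major elements is then explored, and the resulting formula also predicts Zr reasonably well. More importantly, this paper introduces a machine learning method for data mining into geochemistry, a field with massive amounts of data, and proposes an instructive data analysis scheme.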
Keywords: machine learning/
random forest method/
ocean island basalt/
trace elements/
major elements
Abstract: We investigate the relationships between major and trace elements using the random forest algorithm, a machine learning method. The relationship between Zr and major elements in ocean island basalts (OIB) is selected as an example because: 1) OIB are hardly affected by crustal contamination, so the relationships between their elements are relatively stable; 2) the trace element Zr is a high field strength element and is chemically stable. Five major elements (TiO2, CaO, MgO, Na2O and P2O5) are selected as predictors based on the variable importance measures of the random forest method combined with the results of Pearson correlation analysis. A random forest model containing one thousand decision trees, with three features in each tree, is built from these five elements, and its predictions of Zr are superior to those of the general multivariable regression method. Furthermore, an empirical formula relating Zr to these five major elements is fitted, and this formula also predicts Zr well. By providing a worked example, this study introduces a machine learning method to geochemistry, a discipline that holds massive amounts of data and requires new data mining techniques.
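The workflow summarized in the abstract can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scikit-learn and NumPy, runs on synthetic stand-in data rather than real OIB analyses, and uses an invented Zr response. It also interprets "three features" as the number of candidate features considered at each split (scikit-learn's `max_features`), which is an assumption about the original model configuration.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Synthetic stand-in data (NOT real OIB analyses): five major-element oxides in wt%.
FEATURES = ["TiO2", "CaO", "MgO", "Na2O", "P2O5"]
rng = np.random.default_rng(0)
n = 400
X = np.column_stack([
    rng.uniform(1.0, 5.0, n),    # TiO2
    rng.uniform(6.0, 13.0, n),   # CaO
    rng.uniform(3.0, 15.0, n),   # MgO
    rng.uniform(1.5, 5.0, n),    # Na2O
    rng.uniform(0.1, 1.5, n),    # P2O5
])
# Hypothetical Zr response (ppm), invented purely for illustration.
zr = 40 * X[:, 0] + 150 * X[:, 4] + 300 / X[:, 2] + rng.normal(0, 10, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, zr, test_size=0.25, random_state=0
)

# Pearson correlation of each oxide with Zr on the training set.
pearson_r = {
    name: float(np.corrcoef(X_train[:, i], y_train)[0, 1])
    for i, name in enumerate(FEATURES)
}

# Random forest: 1,000 trees, 3 candidate features per split (assumed reading).
forest = RandomForestRegressor(n_estimators=1000, max_features=3, random_state=0)
forest.fit(X_train, y_train)
importances = dict(zip(FEATURES, forest.feature_importances_))

# Ordinary multivariable linear regression as the baseline.
linear = LinearRegression().fit(X_train, y_train)

# Coefficient of determination (R^2) on the held-out samples.
r2_forest = forest.score(X_test, y_test)
r2_linear = linear.score(X_test, y_test)

for name in FEATURES:
    print(f"{name}: importance={importances[name]:.3f}, r={pearson_r[name]:+.3f}")
print(f"R^2 random forest = {r2_forest:.3f}, R^2 linear = {r2_linear:.3f}")
```

The importances reported here are scikit-learn's impurity-based values, which need not match the variable importance measure used in the paper. Which model scores higher on the held-out set depends on the data, so the printed comparison says nothing about the paper's result.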
Full-text PDF download:
http://www.dzkx.org/data/article/export-pdf?id=geology_11492