Spatial prediction of soil salinity in the Yellow River Delta based on geographically weighted regression
WU Chunsheng 1,2, HUANG Chong 1,3, LIU Gaohuan 1, LIU Qingsheng 1
1. State Key Laboratory of Resources and Environment Information System, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
2. University of Chinese Academy of Sciences, Beijing 100049, China
3. Key Laboratory of Ecosystem Network Observation and Modeling, Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences, Beijing 100101, China
Corresponding author: HUANG Chong, E-mail: huangch@lreis.ac.cn
Received: 2015-10-19; Revised: 2016-01-08; Published online: 2016-04-25
Copyright: 2016, Editorial Office of Resources Science
Funding: National Natural Science Foundation of China (41471335; 41271407); National Key Technology R&D Program of China (2013BAD05B03)
About the author: WU Chunsheng, male, from Heze, Shandong Province, PhD candidate; his research focuses on ecological GIS and remote sensing applications. E-mail: wuchsh0118@163.com
Keywords: soil salinity; geographically weighted regression; environmental variables; Yellow River Delta
Abstract The content and spatial distribution of soil salinity are closely related to agricultural development and land productivity at the regional scale. Because soil salinization causes land degradation and affects human livelihoods, it is essential to determine the content and spatial distribution of soil salinity in a timely manner. Geographically weighted regression (GWR) is a local regression interpolation method that predicts the dependent variable at unsampled locations from its relationships with environmental variables and from the spatial distances between sample points and prediction locations. GWR has been applied successfully to soil properties such as soil organic matter. This study explored the feasibility of GWR for predicting soil salinity by comparing it with multiple linear regression (MLR) and Cokriging. The normalized difference vegetation index (NDVI), elevation and the distance of sample points from rivers were selected as auxiliary variables for GWR. The GWR result showed a clear regularity in the spatial distribution of soil salinity: salinity increased from inland areas toward the coast, and values near rivers were lower than elsewhere. These findings indicate that GWR is applicable to predicting soil salinity. Its prediction accuracy was higher than that of MLR and Cokriging, with an RMSE, correlation coefficient, regression coefficient and coefficient of determination of 0.305, 0.649, 0.572 and 0.421, respectively. In addition, the GWR prediction map showed less smoothing than the Cokriging map and more spatial detail than the MLR map.
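The study area is the modern Yellow River Delta, spanning 37°22′N to 38°04′N and 118°14′E to 119°05′E, with a total area of about 5 072.38 km² (Figure 1); it borders the Bohai Sea to the east and north. The terrain is flat, with elevations between 0 and 18.6 m. Channel shifts and alluvial deposition by the Yellow River have produced diverse micro-landforms, including depressions, gently sloping flats and raised levee ridges, which complicate salt transport and cause large regional differences in salt distribution. Rainfall is concentrated from June to September, and mean annual evaporation far exceeds precipitation, which favours the accumulation of salt in the shallow soil.
Figure 1. Locations of the calibration and validation points for soil salinity in the Yellow River Delta in 2003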
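Soil salinization requires three factors: a source of salt, a source of water, and a mechanism that moves salt toward the surface [9]. Following this theory and considering data availability and practicality, six environmental variables were selected. In coastal areas salt comes mainly from seawater intrusion [9,10], and salt content differs greatly with distance from the coast, so distance to the coast was used as the proxy for the salt source. The groundwater table affects the accumulation of salt at the soil surface [32], but groundwater levels were not available for the whole study area, so the distances from sample points to rivers and to the coast, together with elevation, were used as proxies for the water source. The mechanism of upward salt transport is complex, with no agreed conclusion or formula at present [9,32], which makes it difficult to choose environmental factors for it; factors that influence or indicate soil salinity were therefore used as supplements, namely the normalized difference vegetation index (NDVI), slope and the topographic index.
Elevation, slope and the topographic index were extracted from the DEM of the study area with the ArcGIS spatial analysis module; the DEM and river data came from the Resource and Environment Science Data Center of the Institute of Geographic Sciences and Natural Resources Research, Chinese Academy of Sciences (http://www.resdc.cn/). The coastline position and NDVI were derived from USGS Landsat 5 imagery acquired on 24 September 2003 at 30 m spatial resolution; NDVI was calculated from TM bands 3 and 4 as NDVI = (band 4 - band 3)/(band 4 + band 3). Distances to rivers and to the coast were calculated with the ArcGIS Euclidean distance module, and all data were resampled to 30 m resolution.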
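The NDVI formula stated in this section can be illustrated with a minimal sketch in plain Python. The function names `ndvi` and `ndvi_grid`, the nested-list raster layout, and the choice to return NaN where both bands are zero are assumptions made for illustration only; the source does not say which software performed this step.

```python
import math


def ndvi(red, nir):
    """NDVI for one pixel; red = TM band 3, nir = TM band 4 (per the text)."""
    total = nir + red
    if total == 0:
        # Assumption: flag pixels with no signal as NaN instead of dividing by zero.
        return math.nan
    return (nir - red) / total


def ndvi_grid(red_band, nir_band):
    """Apply ndvi() cell by cell to two equally sized rasters (lists of rows)."""
    return [
        [ndvi(r, n) for r, n in zip(red_row, nir_row)]
        for red_row, nir_row in zip(red_band, nir_band)
    ]
```

Both inputs are expected on the same scale (for example reflectance), since the ratio is only meaningful when band 3 and band 4 are comparable.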
2.5 Accuracy validation metrics
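The 94 sampling points were divided into 70 calibration (interpolation) points and 24 validation points (Figure 1, Tables 1 and 2). Each interpolation method was evaluated and compared by how well its predictions matched the measured values at the validation points. The goodness-of-fit measures were the mean error, the root mean square error, the correlation coefficient, the regression coefficient and the coefficient of determination of the regression line. The mean error (ME) and root mean square error (RMSE) are:
$$\mathrm{ME} = \frac{1}{n}\sum_{i=1}^{n}\left[\hat{z}(x_i) - z(x_i)\right] \quad (5)$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left[\hat{z}(x_i) - z(x_i)\right]^{2}} \quad (6)$$
where $\hat{z}(x_i)$ and $z(x_i)$ are the predicted and measured salinity at validation point $x_i$, and $n$ is the number of validation points.
Table 1. Salt contents of calibration samples in the study area (%)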
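| ID | Salinity | ID | Salinity | ID | Salinity | ID | Salinity | ID | Salinity | ID | Salinity | ID | Salinity |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.05 | 15 | 0.97 | 29 | 0.59 | 45 | 1.48 | 57 | 0.37 | 69 | 0.17 | 83 | 0.14 |
| 2 | 2.01 | 16 | 0.14 | 30 | 1.49 | 46 | 0.63 | 58 | 0.83 | 70 | 0.31 | 85 | 0.41 |
| 3 | 2.04 | 18 | 0.11 | 32 | 0.12 | 48 | 0.20 | 59 | 1.06 | 71 | 0.14 | 86 | 0.92 |
| 5 | 1.34 | 20 | 0.14 | 34 | 0.54 | 49 | 0.55 | 60 | 0.34 | 72 | 0.04 | 87 | 0.08 |
| 6 | 0.51 | 22 | 0.46 | 36 | 0.08 | 50 | 0.18 | 61 | 0.24 | 73 | 0.35 | 89 | 0.07 |
| 7 | 0.33 | 23 | 0.22 | 37 | 0.05 | 51 | 0.14 | 63 | 0.05 | 75 | 0.49 | 90 | 0.32 |
| 8 | 0.22 | 24 | 0.78 | 39 | 0.53 | 52 | 0.34 | 64 | 0.38 | 76 | 0.12 | 91 | 0.16 |
| 9 | 0.28 | 25 | 0.36 | 41 | 0.71 | 53 | 1.06 | 65 | 0.80 | 77 | 0.29 | 92 | 0.44 |
| 11 | 0.57 | 26 | 0.40 | 42 | 1.16 | 55 | 0.78 | 66 | 0.38 | 80 | 1.50 | 93 | 0.05 |
| 14 | 1.63 | 28 | 0.18 | 43 | 1.10 | 56 | 1.23 | 67 | 1.27 | 82 | 0.95 | 94 | 0.48 |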
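The five validation measures named in Section 2.5 can be computed for any set of validation points with a short sketch like the one below. It is an illustration, not the authors' procedure: the function name `validation_metrics` is hypothetical, the error is assumed to be predicted minus observed, and the regression coefficient is assumed to be the slope of predicted values regressed on observed values (the source does not state either convention).

```python
import math


def validation_metrics(observed, predicted):
    """Return ME, RMSE, Pearson r, regression slope and R^2 for paired values."""
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equally long sequences with at least 2 values")
    # Assumption: error = predicted - observed.
    errors = [p - o for o, p in zip(observed, predicted)]
    me = sum(errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    s_oo = sum((o - mean_o) ** 2 for o in observed)
    s_pp = sum((p - mean_p) ** 2 for p in predicted)
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    r = s_op / math.sqrt(s_oo * s_pp)
    # Assumption: slope of the line predicted = a + slope * observed.
    slope = s_op / s_oo
    # For a simple linear regression, R^2 equals the squared correlation.
    return {"me": me, "rmse": rmse, "r": r, "slope": slope, "r2": r * r}
```

Because the regression here has a single predictor, the coefficient of determination is simply the square of the correlation coefficient.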
Table 2. Salt contents of validation samples in the study area (%)
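Descriptive statistics for salinity, Cl⁻ and the environmental variables are given in Table 3. For ease of analysis, salinity and Cl⁻ were both converted to percentages. Soil salt content ranged from 0.04% to 2.04%, with a mean of 0.57% and a coefficient of variation of 85.96%, indicating moderate variability; Cl⁻ and the environmental variables also varied across the study area.
Table 3. Descriptive statistics of soil salinity, Cl⁻ and environmental variables in the study area in 2003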
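| Variable | Minimum | Maximum | Mean | Standard deviation | Coefficient of variation/% | Skewness | Kurtosis |
|---|---|---|---|---|---|---|---|
| Salinity/% | 0.04 | 2.04 | 0.57 | 0.49 | 85.96 | 1.20 | 0.82 |
| Cl⁻/% | 0.02 | 1.07 | 0.29 | 0.27 | 93.02 | 1.23 | 0.74 |
| Distance to coast/m | 1 129.29 | 43 592.30 | 17 350.36 | 10 072.84 | 58.06 | 0.55 | -0.44 |
| Distance to rivers/m | 60.00 | 14 597.90 | 3 258.98 | 3 209.02 | 98.47 | 1.82 | 3.33 |
| NDVI | 0.01 | 0.52 | 0.26 | 0.14 | 53.85 | 0.13 | -1.12 |
| Slope/° | 0.00 | 2.25 | 0.09 | 0.27 | 34.27 | 7.54 | 60.43 |
| Elevation/m | 1.16 | 8.46 | 4.20 | 2.02 | 48.37 | 0.50 | -0.72 |
| Topographic index | 0.00 | 12.43 | 6.82 | 3.18 | 46.63 | -0.86 | 0.60 |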
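The coefficient of variation reported for salinity can be reproduced from the rounded mean and standard deviation in Table 3, as the sketch below shows. The function names are hypothetical, and the class limits (below 10% weak, 10% to 100% moderate, above 100% strong) are a commonly used convention assumed here; the source only states that 85.96% counts as moderate.

```python
def coefficient_of_variation(mean, std_dev):
    """Return the coefficient of variation as a percentage."""
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return std_dev / mean * 100.0


def variability_class(cv_percent):
    """Classify variability; thresholds are an assumed convention."""
    if cv_percent < 10.0:
        return "weak"
    if cv_percent <= 100.0:
        return "moderate"
    return "strong"


# Salinity row of Table 3: mean 0.57 %, standard deviation 0.49 %.
salinity_cv = coefficient_of_variation(0.57, 0.49)
```

Recomputing from rounded table entries will not always match the published value to the second decimal, because the published figures were presumably derived from unrounded data.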
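The correlations between salinity and the environmental variables are shown in Table 4. Salinity is correlated to some degree with every environmental variable, and shows significant negative correlations with distance to the coast, NDVI and elevation. The farther a site is from the coast, the less the soil is affected by seawater intrusion and the lower its salt content. The higher the terrain, the greater the resistance to salt rising to the surface with evaporating water, and the smaller the accumulation. The higher the NDVI, the better the vegetation condition and the weaker the influence of salt, which also indicates low soil salt content. There are exceptions: the salt-tolerant plant Suaeda heteroptera grows abundantly on coastal tidal flats and can also produce high NDVI values, but NDVI remains usable as an auxiliary variable when combined with elevation. The correlations in Table 4 imply that spatial variation in the environmental variables will cause local variation in salt content.
Table 4. Correlation coefficients between soil salinity and environmental variables in the study area in 2003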
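In Cokriging, the stronger the correlation between the primary variable and the covariable, the more accurate the interpolation [33,34]. Statistical analysis showed that, among the ions, Cl⁻ was most strongly correlated with salinity, so it was used as the covariable for salinity. The K-S test values for salinity and Cl⁻ were 0.054 and 0.023, respectively, but Table 3 shows that both have skewness coefficients greater than 1, so both were log-transformed to approximate a normal distribution [15]; after transformation the K-S test values were 0.90 and 0.75, which meets the requirements of kriging interpolation. Semivariogram analysis was then performed for salinity, Cl⁻ and their cross-variable to obtain the parameter values and fitted models. Table 5 shows that all three nugget values are small, indicating that little of the variability arises from random error, while the ratio of partial sill to sill, C/(C0+C), exceeds 0.75 in all three cases, indicating that the variability is dominated by spatial structure [35,36]. The best-fitting model was Gaussian in every case, and the semivariogram results were the same in all directions (isotropy), so a single model and parameter set could be used for interpolation. Salinity was interpolated by Cokriging with the parameters in Table 5; the result is shown in Figure 2.
Table 5. Semivariogram parameters of soil salinity, Cl⁻ and their cross-variable in the study area in 2003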
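| Variable | Nugget | Partial sill | Range/m | RSS | R² | C/(C0+C) | Model |
|---|---|---|---|---|---|---|---|
| Salinity | 0.001 | 1.016 | 14 427.983 | 0.131 | 0.860 | 0.999 | Gaussian |
| Cl⁻ | 0.001 | 1.149 | 14 029.612 | 0.109 | 0.891 | 0.999 | Gaussian |
| Cross-variable | 0.001 | 1.073 | 14 341.381 | 0.125 | 0.874 | 0.999 | Gaussian |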
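The Table 5 parameters can be plugged into a Gaussian semivariogram to see how the nugget, partial sill and range interact. The sketch below is illustrative only: it assumes the form γ(h) = C0 + C[1 - exp(-h²/a²)] and that the tabulated range is the parameter a, whereas some packages report an effective range and use a factor of 3 in the exponent. The function names are hypothetical.

```python
import math


def gaussian_semivariance(lag, nugget, partial_sill, range_param):
    """Gaussian model: C0 + C * (1 - exp(-(h / a)^2)), with gamma(0) = 0."""
    if lag == 0:
        return 0.0
    return nugget + partial_sill * (1.0 - math.exp(-((lag / range_param) ** 2)))


def structure_ratio(nugget, partial_sill):
    """Share of the sill explained by spatial structure: C / (C0 + C)."""
    return partial_sill / (nugget + partial_sill)


# Salinity row of Table 5 (log-transformed data, range in metres).
salinity_ratio = structure_ratio(0.001, 1.016)
salinity_gamma_far = gaussian_semivariance(1.0e9, 0.001, 1.016, 14427.983)
```

With a nugget of 0.001 against a partial sill of about 1, the ratio rounds to 0.999 for all three rows, which matches the C/(C0+C) column.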
Figure 2. Interpolation results generated by Cokriging and MLR in the Yellow River Delta in 2003
3.3 Selection of environmental variables
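Table 4 shows that the environmental variables are also correlated with one another to varying degrees. If all of them were used in the regression model, multicollinearity would reduce its accuracy and make the results unreliable, so the combination of environmental variables must be tested before modelling [25]. This study used stepwise regression to screen the environmental variables and obtain the optimal combination, and used tolerance and the variance inflation factor (VIF) to test for collinearity among the variables in each combination. The stepwise regression results are given in Table 6.
Table 6. Variable combinations generated by stepwise regression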
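| Model | Variables | R | R² | Adjusted R² | Standard error of the estimate |
|---|---|---|---|---|---|
| Model 1 | NDVI | 0.499 | 0.249 | 0.237 | 0.429 |
| Model 2 | NDVI, elevation | 0.542 | 0.294 | 0.273 | 0.419 |
| Model 3 | NDVI, elevation, distance to rivers | 0.585 | 0.342 | 0.312 | 0.407 |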
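The adjusted R² values in Table 6 follow from R², the sample size and the number of predictors. The sketch below assumes the regressions were fitted on the 70 calibration points, which the source implies but does not state outright; the function name is hypothetical.

```python
def adjusted_r2(r2, n_samples, n_predictors):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    dof = n_samples - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more samples than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_samples - 1) / dof


# Assumption: n = 70 calibration points for all three models in Table 6.
model2_adj = adjusted_r2(0.294, 70, 2)
model3_adj = adjusted_r2(0.342, 70, 3)
```

The penalty grows with each added predictor, so the fact that adjusted R² still rises from model 1 to model 3 supports keeping all three variables.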
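Stepwise regression produced three variable combinations. Model 3 has the highest R² and adjusted R² and the smallest standard error of the estimate; it also includes the most variables, so it makes fuller use of the environmental variables and fits better. The collinearity test results for the variables in model 3 are given in Table 7.
Table 7. Multicollinearity test results for model 3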
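Tolerance and VIF can be obtained from the correlation matrix of the predictors: the VIF of predictor j is the j-th diagonal element of the inverse correlation matrix, and tolerance is its reciprocal. The sketch below uses a made-up correlation matrix, not the study's data, and the helper names `invert` and `vif_and_tolerance` are hypothetical.

```python
def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [
        [float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular matrix: predictors are perfectly collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def vif_and_tolerance(correlation_matrix):
    """Return (VIF list, tolerance list) for a predictor correlation matrix."""
    inverse = invert(correlation_matrix)
    vifs = [inverse[j][j] for j in range(len(correlation_matrix))]
    return vifs, [1.0 / v for v in vifs]


# Illustrative matrix only: two predictors correlated at 0.6, a third independent.
example_vifs, example_tolerances = vif_and_tolerance(
    [[1.0, 0.6, 0.0], [0.6, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
```

A VIF near 1 (tolerance near 1) means a predictor is almost independent of the others; large VIF values signal the collinearity the screening step is meant to avoid.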
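Based on the environmental variables selected by stepwise regression and the soil salinity data, ordinary least squares analysis was run in ArcGIS to obtain the coefficients of the environmental variables and the fit parameters of the MLR model (Table 8). The intercept and the coefficients of NDVI, elevation and distance to rivers are 1.336 4, -1.549, -0.001 and -0.354×10⁻⁴, respectively. The model is:
$$S = 1.3364 - 1.549\,\mathrm{NDVI} - 0.001\,H - 0.354\times10^{-4}\,D_{r} \quad (7)$$
where $S$ is the predicted soil salinity, $H$ is elevation and $D_{r}$ is the distance to rivers.
Table 8. Comparison of model parameters between MLR and GWR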
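The global MLR model can be applied to any location once its three predictors are known. The sketch below hard-codes the coefficients reported in this section; the function name `mlr_salinity` is hypothetical, and the units (elevation and river distance in metres, salinity in percent) are assumed from Table 3.

```python
# Coefficients as reported in the text for the MLR model.
INTERCEPT = 1.3364
COEF_NDVI = -1.549
COEF_ELEVATION = -0.001
COEF_RIVER_DISTANCE = -0.354e-4


def mlr_salinity(ndvi, elevation_m, river_distance_m):
    """Predict soil salinity with the single, global set of MLR coefficients."""
    return (
        INTERCEPT
        + COEF_NDVI * ndvi
        + COEF_ELEVATION * elevation_m
        + COEF_RIVER_DISTANCE * river_distance_m
    )


# Hypothetical location: NDVI 0.5, elevation 2 m, 1 000 m from the nearest river.
example_prediction = mlr_salinity(0.5, 2.0, 1000.0)
```

Every location shares the same coefficients here, which is exactly the property that a locally fitted model relaxes.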
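Based on the same environmental variables and soil salinity data, the geographically weighted regression module in ArcGIS was used to obtain the spatial distribution of each variable's coefficient (Figure 3) and the model fit parameters (Table 8). All coefficients vary over space, in contrast to the single global coefficients of the MLR model. Comparing the two models, the adjusted R² of GWR reaches 0.482, exceeding that of MLR, and its AICc (a measure of model performance; smaller is better) is lower than that of MLR, showing that GWR fits the observations better. The ArcGIS raster calculator was then used to combine the coefficient surfaces with the variable rasters, giving the GWR prediction shown in Figure 4.
Figure 3. Spatial distribution of the environmental variable coefficients generated by GWR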
Figure 4. Interpolation result generated by GWR in the Yellow River Delta in 2003
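The idea behind GWR, a separate weighted least squares fit at each prediction location, can be sketched in a few lines of plain Python. This is not the ArcGIS implementation used in the study: the Gaussian kernel, the fixed bandwidth, and the names `solve`, `gwr_coefficients` and `gwr_predict` are all assumptions made for illustration.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular local design matrix")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def gwr_coefficients(target, coords, predictors, response, bandwidth):
    """Local coefficients [intercept, b1, ...] at `target` (x, y)."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtwx = [[0.0] * k for _ in range(k)]
    xtwy = [0.0] * k
    for (px, py), row, y in zip(coords, rows, response):
        distance = math.hypot(px - target[0], py - target[1])
        # Assumed Gaussian kernel: nearby samples get weights close to 1.
        weight = math.exp(-0.5 * (distance / bandwidth) ** 2)
        for i in range(k):
            xtwy[i] += weight * row[i] * y
            for j in range(k):
                xtwx[i][j] += weight * row[i] * row[j]
    return solve(xtwx, xtwy)


def gwr_predict(target, target_predictors, coords, predictors, response, bandwidth):
    """Predict the response at `target` from its own locally fitted coefficients."""
    beta = gwr_coefficients(target, coords, predictors, response, bandwidth)
    return beta[0] + sum(b * x for b, x in zip(beta[1:], target_predictors))
```

Repeating `gwr_coefficients` over every raster cell would yield coefficient surfaces, and multiplying those surfaces by the predictor rasters would yield a prediction map; as the bandwidth grows very large, all weights approach 1 and the local fit approaches an ordinary global regression.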
3.6 Accuracy comparison
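Predicted values were obtained at the 24 validation points, and the mean error, root mean square error, correlation coefficient, regression coefficient and coefficient of determination against the measured values were calculated (Table 9). GWR has a larger correlation coefficient and coefficient of determination than MLR and Cokriging and the smallest RMSE, indicating higher prediction accuracy. All three mean errors are small; that of Cokriging is closest to 0, so its predictions are the least biased. Although Cokriging has a larger regression coefficient than the other two methods, its lower correlation coefficient and coefficient of determination show that its regression line does not describe the relationship between predicted and measured values well (Figure 5). Statistically, GWR is the most accurate, followed by MLR, with Cokriging the least accurate. The three methods yield similar spatial trends in soil salinity, but the Cokriging result is smoother and shows less detail than the other two, whereas the GWR and MLR results differ mainly in the predicted salt contents.
Table 9. Comparison of prediction accuracy for Cokriging, MLR and GWR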
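| Method | Mean error | Root mean square error | Correlation coefficient | Regression coefficient | Coefficient of determination |
|---|---|---|---|---|---|
| GWR | -0.029 | 0.305 | 0.649 | 0.572 | 0.421 |
| MLR | -0.043 | 0.306 | 0.632 | 0.508 | 0.399 |
| Cokriging | -0.001 | 0.444 | 0.506 | 0.652 | 0.256 |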
Figure 5. Regression curves of observed versus predicted salinity for Cokriging, MLR and GWR in 2003
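The experiments lead to the conclusion that GWR is feasible for predicting soil salt content. The interpolation shows that soil salinity in the Yellow River Delta in 2003 increased gradually from inland toward the coast and was markedly lower along rivers than elsewhere, a regular and plausible spatial pattern. Compared with Cokriging and MLR, GWR gave more accurate results: its RMSE was 0.305, clearly smaller than the 0.444 of Cokriging, and the correlation coefficient, regression coefficient and coefficient of determination between predicted and measured values were 0.649, 0.572 and 0.421, respectively. Taken together, these indicate higher interpolation accuracy than Cokriging and MLR. The GWR map also shows a more reasonable spatial distribution of salinity: it reduces the smoothing seen with Cokriging and shows more local detail than MLR, so GWR is a good interpolation method. GWR still has open problems, however, and its application to soil properties and to other environmental fields needs further study, for example how to include categorical variables and how to handle multicollinearity effectively.
The authors have declared that no competing interests exist.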
[1] Metternicht GI, Zinck JA. Remote sensing of soil salinity: Potentials and constraints [J]. 2003, 85(1): 1-20.
[2] Wang YG, Xiao DN, Li Y, et al. Soil salinity evolution and its relationship with dynamics of groundwater in the oasis of inland river basins: Case study from the Fubei region of Xinjiang Province, China [J]. 2008, 140(1-3): 291-302.
[3] Reichman SM, Bellairs SM, Mulligan DR. The effects of temperature and salinity on Acacia harpophylla (brigalow) (Mimosaceae) germination [J]. 2006, 28(2): 175-178.
[4] Rewald B, Leuschner C, Wiesman Z, et al. Influence of salinity on root hydraulic properties of three olive varieties [J]. 2011, 145(1): 12-22.
[5] Shelef O, Lazarovitch N, Rewald B, et al. Root halotropism: Salinity effects on Bassia indica root [J]. 2010, 144(2): 471-478.
[6] Xing JJ, Cai M, Chen SS, et al. Seed germination, plant growth and physiological responses of Salsola ikonnikovii to short-term NaCl stress [J]. 2013, 147(2): 285-297.
[7] Lv ZZ, Liu GM, Yang JS, et al. Spatial variability of soil salinity in Bohai Sea coastal wetlands, China: Partition into four management zones [J]. 2013, 147(4): 1201-1210.
[8] Ghassemi F, Jakeman AJ, Nix HA, et al. Salinisation of Land and Water Resources: Human Causes, Extent, Management and Case Studies [M]. 1995.
[9] Fan X, Pedroli B, Liu G, et al. Soil salinity development in the Yellow River Delta in relation to groundwater dynamics [J]. 2012, 23(2): 175-189.
[10] Lin Y, Huang C, Liu GH, et al. Mapping soil salinity using a similarity-based prediction approach: A case study in Huanghe River Delta, China [J]. 2015, 25(3): 283-294.
[11] Bogunovic I, Mesic M, Zgorelec Z, et al. Spatial variation of soil nutrients on sandy-loam soil [J]. 2014, 144(4): 174-183.
[12] Liu R, Chen YC, Sun CC, et al. Uncertainty analysis of total phosphorus spatial-temporal variations in the Yangtze River Estuary using different interpolation methods [J]. 2014, 86(1-2): 68-75.
[13] Shahbeik S, Afzal P, Moarefvand P, et al. Comparison between ordinary kriging (OK) and inverse distance weighted (IDW) based on estimation error. Case study: Dardevey iron ore deposit, NE Iran [J]. 2014, 7(9): 3693-3704.
[14] Dai FQ, Zhou QG, Lv ZQ, et al. Spatial prediction of soil organic matter content integrating artificial neural network and ordinary kriging in Tibetan Plateau [J]. 2014, 45(5): 184-194.
[15] Emadi M, Baghernejad M. Comparison of spatial interpolation techniques for mapping soil pH and salinity in agricultural coastal areas, northern Iran [J]. 2014, 60(9): 1315-1327.
[16] Li HY, Shi Z, Webster R, et al. Mapping the three-dimensional variation of soil salinity in a rice-paddy soil [J]. 2013, 195-196(1): 31-41.
[17] Song GX, Zhang J, Wang K. Selection of optimal auxiliary soil nutrient variables for cokriging interpolation [J]. 2014, 9(6): e99695.
[18] Hengl T, Heuvelink GBM, Rossiter DG. About regression-kriging: From equations to case studies [J]. 2007, 33(10): 1301-1315.
[19] Hughson L, Huntley D, Razack M. Cokriging limited transmissivity data using widely sampled specific capacity from pump tests in an alluvial aquifer [J]. 1996, 34(1): 12-18.
[20] Lark RM. Regression analysis with spatially autocorrelated error: Simulation studies and application to mapping of soil organic matter [J]. 2000, 14(3): 247-264.
[21] McKenzie NJ, Austin MP. A quantitative Australian approach to medium and small-scale surveys based on soil stratigraphy and environmental correlation [J]. 1993, 57(4): 329-355.
[22] Pelto CR, Elkins TA, Boyd HA. Automatic contouring of irregularly spaced data [J]. 1968, 33(3): 424.
Fotheringham AS, Charlton M, Brunsdon C. The geography of parameter space: An investigation of spatial non-stationarity [J]. 1996, 10(5): 605-627.
[25] Wang K, Zhang CR, Li WD, et al. Mapping soil organic matter with limited sample data using geographically weighted regression [J]. 2014, 59(1): 91-106.
[26] Mishra U, Lal R, Liu DS, et al. Predicting the spatial variation of the soil organic carbon pool at a regional scale [J]. 2010, 74(3): 906-914.
[27] Mishra U, Riley WJ. Alaskan soil carbon stocks: Spatial variability and dependence on environmental factors [J]. 2012, 9(5): 3637-3645.
[28] Zhang CS, Tang Y, Xu XL, et al. Towards spatial geochemical modelling: Use of geographically weighted regression for mapping soil organic carbon contents in Ireland [J]. 2011, 26(7): 1239-1248.
[29] Wang K, Zhang CR, Li WD. Predictive mapping of soil total nitrogen at a regional scale: A comparison between geographically weighted regression and Cokriging [J]. 2013, 42(8): 73-85.
Fan XM, Liu GH, Liu HG. Evaluating the spatial distribution of soil salinity in the Yellow River Delta based on kriging and Cokriging methods [J]. 2014, 36(2): 321-327.
Wang H, Liu GH, Gong P. Use of Cokriging to improve estimates of soil salt solute spatial distribution in the Yellow River Delta [J]. 2005, 60(3): 511-518.
[35] Cambardella CA, Moorman TB, Novak JM, et al. Field-scale variability of soil properties in central Iowa soils [J]. 1994, 58(5): 1501-1511.
[36] Chien YJ, Lee DY, Guo HY, et al. Geostatistical analysis of soil properties of mid-west Taiwan soils [J]. 1997, 162(4): 291-298.
[37] Goovaerts P. Geostatistics in soil science: State-of-the-art and perspectives [J]. 1999, 89(1-2): 1-45.
Guo L, Zhang H, Chen J, et al. Comparison between co-kriging model and geographically weighted regression model in spatial prediction of soil attributes [J]. 2012, 49(5): 1037-1042.