2.1. Model setup
The WRF-Chem model is an online, three-dimensional Eulerian chemical transport model that represents the complex physical and chemical processes in the troposphere (Grell et al., 2005). It has been applied in various research settings, especially those concerning feedbacks of air pollution on meteorology and chemical DA (Saide et al., 2012, 2015; Makar et al., 2015; Mizzi et al., 2016). In this study, version 3.7.1 of WRF-Chem was used to simulate the air quality in Hebei Province, China. Two nested domains were set, as shown in Fig. 1. The outer domain covered East Asia with a horizontal resolution of 75 km × 75 km and 106 (lon) × 81 (lat) grids, while the inner domain covered Hebei Province with a horizontal resolution of 15 km × 15 km and 76 (lon) × 81 (lat) grids. The model used 24 vertical levels (about 6 levels below 1 km and 20 levels below 10 km), with the model top at 100 hPa. The 0.5° × 0.5° data from NCEP's Global Forecast System provided the meteorological initial conditions and the lateral meteorological boundary conditions every 12 hours. Atmospheric gaseous chemistry and aerosols were simulated using the Second Generation Regional Acid Deposition Model (RADM2) with the Modal Aerosol Dynamics Model for Europe (MADE) and the Secondary Organic Aerosol Model (SORGAM) (Stockwell et al., 1990; Ackermann et al., 1998; Schell et al., 2001) scheme. The anthropogenic emissions inventory was the Multi-resolution Emissions Inventory for China in 2012 (http://www.meicmodel.org). The configuration of WRF-Chem is detailed in Table 1.
2.2. MOS method
MOS finds statistical relationships in training samples that can then be applied to model forecast outputs. The expectation is that model errors will be corrected, producing a forecast that better fits the observations. In this study, a one-dimensional Kalman filter was chosen as the algorithm to implement the MOS process. The algorithm was formulated in a way that generally resembled G02, with the following modifications. Firstly, in this work, the measurement y(t), as well as the real value x(t), could be either the difference or the ratio between the forecast and observation, whereas in Eqs. (1) and (2) of G02 they only denoted the difference. Furthermore, hourly concentrations of five species from the three-day model output were split into 3 × 5 × 24 independent daily concentration series. Lastly, given that Kalman filtering can only predict one time step ahead (one day ahead, in this context), the correction could only work during forecast hours 0-24 (24-h forecast hereafter), leaving forecast hours 24-48 and 48-72 (48-h and 72-h forecasts hereafter) uncorrected. Therefore, to extend the algorithm further, the corrected results from the 24-h (48-h) forecast were used as a proxy observation at the corresponding time to correct the 48-h (72-h) model output. Appendix A describes the steps in more detail.
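As a rough illustration of the one-day-ahead correction described above, the following sketch applies a scalar Kalman filter to a single daily series (one species, one forecast hour, one station), using the forecast-minus-observation difference as the measurement. It assumes a random-walk model for the true bias and fixed noise variances. The function names and the variance values are hypothetical and are not taken from G02 or from this study, whose formulation may differ.

```python
from typing import List, Sequence


def kalman_bias_forecast(
    errors: Sequence[float],
    process_var: float = 1.0,   # assumed variance of the day-to-day drift of the true bias
    obs_var: float = 4.0,       # assumed variance of the noise in each observed error
    init_var: float = 100.0,    # assumed (large) uncertainty of the initial guess
) -> List[float]:
    """Return, for each day, the bias predicted from the preceding days only.

    errors[t] is forecast minus observation on day t for one fixed hour,
    station and species. Element t of the result uses errors[0..t-1].
    """
    x_est = 0.0      # current estimate of the bias
    p = init_var     # error variance of that estimate
    predicted = []
    for y in errors:
        # Predict: under a random walk the estimate carries over unchanged,
        # while its uncertainty grows by the process variance.
        p_prior = p + process_var
        predicted.append(x_est)
        # Update: blend in the error observed on this day.
        gain = p_prior / (p_prior + obs_var)
        x_est = x_est + gain * (y - x_est)
        p = (1.0 - gain) * p_prior
    return predicted


def correct_forecasts(
    forecasts: Sequence[float], observations: Sequence[float], **kwargs: float
) -> List[float]:
    """Subtract the one-day-ahead predicted bias from each daily forecast."""
    errors = [f - o for f, o in zip(forecasts, observations)]
    predicted_bias = kalman_bias_forecast(errors, **kwargs)
    return [f - b for f, b in zip(forecasts, predicted_bias)]
```

Under these assumptions a constant bias is learned within a few days. For the 48-h and 72-h extension described in the text, the corrected 24-h (48-h) values would take the place of `observations`.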
2.3. DA configuration
In this study, 3DVar DA was implemented to optimize the CICs for the inner model domain. The DA system and formulation used were based on L13, with the following modifications. In addition to the fine particulate matter (PM2.5) assimilated in L13, particulate matter with diameters between 2.5 μm and 10 μm (PM2.5-PM10) was also assimilated, and the analysis increment was added to the corresponding model variables following L13. Gaseous species, including SO2, NO2 and O3, were also assimilated to decrease the uncertainty of their concentrations in the model CICs.
Following L13, the National Meteorological Center (NMC) method (Parrish and Derber, 1992) was adopted to estimate the background root-mean-square error (RMSE) and the three Kronecker product members of the background error correlation matrix. The NMC method used the difference between the 12- and 24-h WRF-Chem forecasts valid at the same time of 1200 UTC for a whole month. No cross correlation between different species was assigned for background error. The domain-average RMSEs for the five species are shown in Fig. 2. The vertical distributions of RMSE display a relatively rapid decrease with height for all species except O3, which peaks at around 4 km above the ground. The vertical correlation matrices are displayed in Fig. 3. The correlation between different height levels experiences a jump at the top of the boundary layer, which seems to be a common feature for all species. In addition, the band of high correlation along the diagonal seems wider in the middle troposphere than in the upper or lower troposphere.
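The NMC estimate described above can be sketched as follows for the vertical part of the statistics. This is a simplified illustration, not the L13 implementation: it assumes that each sample is one column of 24-h and 12-h forecast values valid at the same time, that the mean difference is removed before the statistics are computed, and that no scaling factor is applied. The function and argument names are hypothetical.

```python
import math
from typing import List, Tuple


def nmc_statistics(
    f24: List[List[float]], f12: List[List[float]]
) -> Tuple[List[float], List[List[float]]]:
    """Estimate per-level error standard deviations and vertical correlations.

    f24[s][k] and f12[s][k] are the 24-h and 12-h forecasts for sample s
    (for example one day at one grid column) and model level k.
    Every level is assumed to have a non-zero spread of differences.
    """
    n_samples = len(f24)
    n_levels = len(f24[0])
    # Differences between the two forecasts valid at the same time
    diffs = [[a - b for a, b in zip(r24, r12)] for r24, r12 in zip(f24, f12)]
    means = [sum(d[k] for d in diffs) / n_samples for k in range(n_levels)]
    anomalies = [[d[k] - means[k] for k in range(n_levels)] for d in diffs]
    # Covariance between every pair of levels
    cov = [
        [sum(a[i] * a[j] for a in anomalies) / n_samples for j in range(n_levels)]
        for i in range(n_levels)
    ]
    std = [math.sqrt(cov[k][k]) for k in range(n_levels)]
    corr = [
        [cov[i][j] / (std[i] * std[j]) for j in range(n_levels)]
        for i in range(n_levels)
    ]
    return std, corr
```

The `std` profile corresponds to the kind of curve shown in Fig. 2, and `corr` to the kind of matrix shown in Fig. 3.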
Considering that all the stations were built and maintained under the same standards, no difference in the measurement and representativeness error between different stations was assumed. In addition, cross correlation between different species and stations was set to zero because of the lack of information. Therefore, observation error consisted merely of five values, one for each species. The measurement error was assigned as 1.0 μg m-3, 1.0 μg m-3, 1.0 ppb, 1.0 ppb and 1.0 ppb for PM2.5, PM2.5-PM10, SO2, NO2 and O3, respectively. Representativeness error was estimated following Elbern et al. (2007) and Schwartz et al. (2012) using the formula \begin{equation} \varepsilon_{\rm r}=\gamma\varepsilon_0\sqrt{\dfrac{\Delta x}{L}}, \quad (1) \end{equation} where ε0 and εr are the measurement error and representativeness error, γ is an adjustable parameter that accounts for the lifetime of the species (0.5 for PM2.5, PM2.5-PM10 and O3; 1 for SO2 and 2 for NO2), Δx is the grid spacing (here, 15 km), and L is the radius of influence determined according to the location of stations (here, 4.0 km, the value for suburban stations, assumed for all sites). If the total observation error is defined as the sum of measurement error and representativeness error, the standard deviation of observation error is 2.0 μg m-3, 2.0 μg m-3, 3.0 ppb, 4.9 ppb and 2.0 ppb for PM2.5, PM2.5-PM10, SO2, NO2 and O3, respectively. Though the observation error was determined fairly arbitrarily and empirically here, the uncertainty relating to it should not have a significant influence on the conclusion. That is because the results are usually not very sensitive to the specification of error (Geer et al., 2006), and similar analysis fields were obtained from our experiments when the observation error was increased or reduced by a factor of two or three.
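The arithmetic behind the observation errors can be checked with a short sketch of Eq. (1), using the measurement errors, γ values, grid spacing and radius of influence quoted above. The function and variable names are illustrative only. The totals agree with the quoted values to within about 0.1 (the SO2 total comes out near 2.94 ppb against the quoted 3.0 ppb).

```python
import math
from typing import Dict, Tuple

DX_KM = 15.0   # inner-domain grid spacing, from the text
L_KM = 4.0     # radius of influence, from the text

# species -> (measurement error, gamma), values as quoted in the text;
# units are ug m-3 for particulate matter and ppb for gases
SPECIES: Dict[str, Tuple[float, float]] = {
    "PM2.5": (1.0, 0.5),
    "PM2.5-PM10": (1.0, 0.5),
    "SO2": (1.0, 1.0),
    "NO2": (1.0, 2.0),
    "O3": (1.0, 0.5),
}


def representativeness_error(
    eps0: float, gamma: float, dx: float = DX_KM, radius: float = L_KM
) -> float:
    """Eq. (1): eps_r = gamma * eps0 * sqrt(dx / L)."""
    return gamma * eps0 * math.sqrt(dx / radius)


def total_observation_error(
    eps0: float, gamma: float, dx: float = DX_KM, radius: float = L_KM
) -> float:
    """Total error taken as the plain sum of the two components, as in the text."""
    return eps0 + representativeness_error(eps0, gamma, dx, radius)


if __name__ == "__main__":
    for name, (eps0, gamma) in SPECIES.items():
        print(f"{name}: {total_observation_error(eps0, gamma):.2f}")
```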
Figure 2. Vertical distribution of the root-mean-square of the background errors, in mass concentration for particulate matter and ppb for gaseous species.
Figure 3. Vertical correlations of the background errors for PM2.5, SO2, NO2, and O3. The plot for PM2.5-PM10 is very similar to that of PM2.5 and so is not shown here. Both the x- and y-axes are limited to 10 km, though the real model top is higher.
2.3. Experimental design
To compare the relative importance of MOS and DA, four parallel experiments were designed: Sim_base, Sim_DA, Sim_MOS and Sim_DM. Sim_base served as the base simulation without DA or MOS; Sim_DA was an experiment with only DA employed to optimize the model CICs; Sim_MOS was the same as Sim_base but with the model output corrected by one-dimensional Kalman filtering; and Sim_DM used both the DA and MOS methods.
To mimic an operational forecast setting, as Fig. 4 shows, all experiments initiated a new WRF-Chem forecast at 1200 UTC every day between 30 November 2014 and 31 December 2014. Each forecast was integrated for 84 h to generate 72-h forecasts for each day, with the earliest 12 h discarded as spin-up time. The CICs for each initialization came from the 24-h forecast of the previous cycle; in the experiments with DA, these fields served as the background into which the valid observations were assimilated before WRF-Chem was initialized. The first CICs at 1200 UTC 30 November 2014 came from a two-day spin-up of the climatological background chemistry profile. For all experiments, the chemical boundary conditions came from the default climatological chemistry profile for the outer domain and from the interpolation of the outer domain for the inner domain.
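The timing of one forecast cycle can be laid out with a few lines of date arithmetic. This sketch only restates the schedule given above (1200 UTC start, 84-h run, 12-h spin-up, three 24-h forecast windows); the function names are hypothetical, and it assumes that both the first and last dates are included in the daily sequence.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

SPINUP_HOURS = 12   # discarded at the start of each run
RUN_HOURS = 84      # total integration length per cycle


def cycle_windows(init: datetime) -> Dict[str, Tuple[datetime, datetime]]:
    """Map one initialization time to its spin-up and forecast windows."""
    t0 = init + timedelta(hours=SPINUP_HOURS)   # forecast hour 0
    day = timedelta(hours=24)
    return {
        "spin-up": (init, t0),
        "24-h": (t0, t0 + day),
        "48-h": (t0 + day, t0 + 2 * day),
        "72-h": (t0 + 2 * day, t0 + 3 * day),
    }


def initialization_times(first: datetime, last: datetime) -> List[datetime]:
    """Daily initialization times from first to last, both included."""
    times = []
    current = first
    while current <= last:
        times.append(current)
        current += timedelta(days=1)
    return times


if __name__ == "__main__":
    cycles = initialization_times(
        datetime(2014, 11, 30, 12), datetime(2014, 12, 31, 12)
    )
    for label, (start, end) in cycle_windows(cycles[0]).items():
        print(f"{label}: {start:%Y-%m-%d %H} UTC to {end:%Y-%m-%d %H} UTC")
```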
Figure 4. Time settings of the model in the four experiments. In each forecast cycle, the model was integrated for 84 h, with the first 12 h discarded (black dashed arrow) and the remaining 72 h divided into three parts: 24-h forecast (black solid arrow), 48-h forecast (blue solid arrow) and 72-h forecast (red solid arrow). The thick blue arrows indicate that the CICs are used directly in the experiments without DA and as the background fields of the 3DVar DA in the others.
2.4. Observational data
Hourly concentrations of SO2, NO2, PM10, O3 and PM2.5 at surface level from 207 sites were provided by the Ministry of Environmental Protection of China. Data covered the whole month of December 2014 and had been subjected to routine quality control. As shown in Fig. 5, 155 of the 207 stations were randomly selected for assimilation, and the data from the remaining 52 were used to verify the assimilation process. Because all stations were located at surface level, the adjustment of the CICs from the 3DVar DA was limited to several layers near the surface according to the vertical background error covariance. Furthermore, it should be noted that only those 155 sites that provided their data for the 3DVar DA participated in the MOS process.
Figure 5. The terrain height of Hebei Province, with monitoring sites plotted as filled dots. Red dots are sites that participated in the 3DVar and MOS process, while the blue ones are those used only in the validation of the DA effect.
3.1. Model evaluation
Table 2 presents the mean bias (MB), relative bias (RB), RMSE and correlation coefficient (Corr) for the 24-h, 48-h and 72-h forecasts of Sim_base. In general, the base model simulation provides a fairly good result, especially for NO2, whose bias is small and correlation high. In terms of particulate matter and SO2, the model tends to systematically underestimate the concentration of SO2 as well as that of PM2.5 and PM10. Even so, the model reproduces the temporal variations of particulate matter well, with Corr values higher than 0.47 for PM10 and 0.54 for PM2.5. For O3, the model has a problem: the surface O3 concentrations (5-45 μg m-3 in the observations) are seriously overestimated by the model (20-80 μg m-3 in the simulation), which leads to a positive bias (~40 μg m-3) and a lower Corr (0.44) than for other species. Fortunately, comparing the RMSE of O3 with its MB shows that MB contributes the largest portion of the RMSE, so the model is still able to reproduce the variation of O3. The biases mentioned above can usually be attributed to the uncertainties from the emissions inventory, meteorological forecasting (Tang et al., 2011) and model schemes (Yerramilli et al., 2010). Although the 24-h forecast performs the best for all species, the 48-h and 72-h forecasts are also good enough to yield fairly reliable results, which is critical to the success of MOS over the whole 72-h forecast. In short, the model shows sufficient forecast skill to support the DA and MOS processes.
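The four statistics in Table 2, and the statement that MB accounts for most of the O3 RMSE, can be made concrete with a small sketch. The definitions below are common ones and are assumptions here, since the paper's exact formulas are not given in this section; in particular, RB is taken as MB divided by the mean observation. The sketch also returns the centred (bias-removed) RMSE, which satisfies RMSE² = MB² + centred RMSE², so a large MB share implies a small remaining error in the variation.

```python
import math
from typing import Dict, Sequence


def verification_stats(model: Sequence[float], obs: Sequence[float]) -> Dict[str, float]:
    """Compute MB, RB, RMSE, Corr and the bias-removed RMSE for paired series.

    Both series are assumed to have a non-zero spread, otherwise the
    correlation is undefined.
    """
    n = len(obs)
    errors = [m - o for m, o in zip(model, obs)]
    mean_m = sum(model) / n
    mean_o = sum(obs) / n
    mb = sum(errors) / n                      # mean bias
    rb = mb / mean_o                          # relative bias (assumed definition)
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # RMSE of the errors after the mean bias has been removed
    centred_rmse = math.sqrt(sum((e - mb) ** 2 for e in errors) / n)
    cov = sum((m - mean_m) * (o - mean_o) for m, o in zip(model, obs)) / n
    sd_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    return {
        "mb": mb,
        "rb": rb,
        "rmse": rmse,
        "centred_rmse": centred_rmse,
        "corr": cov / (sd_m * sd_o),
    }
```

For a series whose error is a pure offset, `rmse` equals the absolute value of `mb` and `centred_rmse` is zero, which is the limiting case of the O3 behaviour described above.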
3.2. Validation of MOS
Figure 6 depicts the site-averaged hourly concentration simulated by Sim_MOS, plotted against ground observations. Note that, although the hourly concentrations were averaged over 155 stations, the MB and RMSE values in Table 3 were generated by first calculating the individual errors of the 155 stations and then averaging them.
From Fig. 6 it can be concluded that the forecast from Sim_MOS fits the observations fairly closely, especially for SO2, NO2 and O3. For PM2.5 and PM10, however, the points are more widely scattered, and the extremely high observations are hard for MOS to forecast. Even so, a comparison of Table 3 with Table 2 shows that PM2.5 and PM10, together with the other three species, are clearly corrected at all forecast times. Excluding the 48-h forecast of NO2, MOS can reduce the MB to a large extent, meaning this method can remove the majority of the model's systematic bias. Because of the reduction in MB, the RMSE also decreases for all cases except the 24-h forecast of SO2. In addition, the error reduction is unlikely to weaken as the forecast time advances. For the 72-h and 48-h forecasts, the effect MOS has on the former may rival or even exceed that on the latter, e.g., the RMSE reduction of PM2.5 is even larger for the 72-h than for the 48-h forecast. The same holds for the 48-h and 24-h forecasts. Among all five species, O3 seems to benefit the most from the MOS process. This is because O3 usually follows a very regular daily variation, which makes the hourly-split but daily-linked concentration series almost perfect for the assumptions of one-dimensional Kalman filtering.
Admittedly, MOS degrades the forecast in a few cases (e.g., the 48-h forecast of NO2 and 24-h forecast of SO2, as mentioned above). Such increases in error, however, will usually not be of concern to users, and may well be accepted, as they are extremely small and only appear at times when the model outputs to be corrected are already fairly close to the observations. Nonetheless, when viewed from the correlation perspective, such degradation becomes more obvious. Except for NO2 and O3, the correlations all experience a reduction of 0.1-0.2. Thus, the MOS approach tends to reduce the bias and error at the expense of correlation.
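The order of operations noted at the start of this subsection (station errors first, averaging second) matters because opposite errors at different stations cancel when the concentrations are averaged first. The following sketch, with hypothetical function names and toy numbers, contrasts the two calculations.

```python
import math
from typing import List, Sequence


def rmse(model: Sequence[float], obs: Sequence[float]) -> float:
    """Root-mean-square error of one paired series."""
    return math.sqrt(sum((m - o) ** 2 for m, o in zip(model, obs)) / len(obs))


def mean_station_rmse(
    model_by_station: List[List[float]], obs_by_station: List[List[float]]
) -> float:
    """RMSE computed per station, then averaged (the order used for Table 3)."""
    per_station = [rmse(m, o) for m, o in zip(model_by_station, obs_by_station)]
    return sum(per_station) / len(per_station)


def rmse_of_station_mean(
    model_by_station: List[List[float]], obs_by_station: List[List[float]]
) -> float:
    """RMSE of the station-averaged series (the quantity a plot like Fig. 6 suggests)."""
    n_st = len(model_by_station)
    n_t = len(model_by_station[0])
    mean_model = [sum(s[t] for s in model_by_station) / n_st for t in range(n_t)]
    mean_obs = [sum(s[t] for s in obs_by_station) / n_st for t in range(n_t)]
    return rmse(mean_model, mean_obs)
```

With one station biased by +5 and another by -5, the per-station average reports an RMSE of 5 while the station-mean series shows no error at all, so Table 3 is the stricter measure.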
Figure 6. Hourly concentrations simulated by Sim_MOS, plotted against ground station observations (obs) averaged over 155 stations.
3.3. Validation of DA
Figures 7 and 8 show the change in RMSE and Corr, respectively, over the integration time from -12 h (right after the DA) to 10 h (already integrated for 22 h) for experiments Sim_base and Sim_DA. To keep the verification independent from the observations assimilated, the RMSE and Corr were only averaged over the 52 stations that did not provide their observational data in the 3DVar DA process.
The -12 h forecast in Figs. 7 and 8 shows that DA leads to better initial conditions for the simulation, especially for NO2, PM10, PM2.5 and O3, whose RMSEs decrease substantially at almost all sites. For example, PM10 and PM2.5 show an RMSE reduction of about 50-100 μg m-3, which is about half the RMSE of Sim_base. Such results are as good as those obtained by L13 and Jiang et al. (2013), who also worked on assimilating ground observations using 3DVar. For SO2, the reduction in RMSE is less apparent (although the change in RMSE is still negative when the 52 sites are averaged), but the correlation after DA is still clearly larger than before. The marginal RMSE reduction for SO2 may be acceptable considering that the increase in correlation is rather obvious and the data representativeness of some stations is dubious (Zhang et al., 2016).
Figure 7. RMSE change over the integration time from the -12 h forecast (right after the DA) to the 10 h forecast (already integrated for 22 h) for the five species. All RMSE values were averaged over the 52 stations that did not provide observations in the DA.
Figure 8. Similar to Fig. 7 but for Corr.
However, as expected, the effect of DA slowly diminishes as the integration continues, which has also been observed in other works (Jiang et al., 2013; Li et al., 2013). After the model has been integrated for more than 14 h, the RMSE after DA minus that before DA (RMSE change henceforth) is still negative, but its absolute value is clearly smaller than earlier in the run. The effect of DA remains for longer in the case of O3, PM10 and PM2.5 (RMSE change remains negative for the 14 h), benefiting mainly from their relatively long lifetimes. For example, PM2.5 and PM10 still maintain an RMSE reduction of about 10-20 μg m-3, which is even better than the results reported by L13 and Jiang et al. (2013). However, for NO2, whose lifetime is short, the two experiments show almost no difference in RMSE after four hours of integration. Because the initial improvement from DA is relatively small, the forecast of SO2 soon loses its improvement from DA and shows little RMSE change almost immediately after the model starts running. For SO2 and NO2, the RMSE change is positive in some cases, but the increase in RMSE is usually very small compared with the original RMSE, and therefore unlikely to be a problem. Conclusions from the viewpoint of correlation are similar to those from the RMSE, except that the effect of DA seems more obvious and longer-lasting.
Overall, for most cases, DA successfully produces better CICs for the model and may help to improve the forecast skill in the following half to one day.
3.4. Effect of MOS and DA
From Fig. 9, which plots the RMSE of the four experiments and five species at different forecast hours from -12 h to 72 h, we can see that the forecast error varies with forecast hour. Considering that the RMSE was calculated from the statistics of one month, Fig. 9 is a good representation of the general features of the four experiments.
By comparing the experiments using MOS (solid lines) with those without MOS (dotted lines), it is clear that all species show a large RMSE reduction across the 72-h forecast span, and that this reduction is much larger than can be provided by DA (solid lines are below the dotted lines by a larger distance than the blue lines are below the red). For example, when compared to Sim_base, the average SO2 RMSE decreases by 4.34 μg m-3 for Sim_MOS (average taken over all stations throughout the whole 72-h forecast), while the decrease is only 0.48 μg m-3 for Sim_DA. Worse still, when the forecast runs to its second or third day, the effect of DA inevitably diminishes (as evidenced by the overlapping red and blue dotted lines after 24 h), while MOS can still work during this period (solid lines do not overlap the dotted lines, even after 24 h).
The blue solid lines represent the simulation RMSE corrected only with MOS, while the red solid lines are the results processed by both MOS and DA. The two lines overlap at almost all times for all species, meaning the DA of the initial conditions provides little help to the effect of MOS, although it does provide better initial conditions for the model to generate a better forecast. However, sometimes the two lines do not overlap and show some differences, which is common for all species but most obvious for NO2 and least obvious for PM10 and PM2.5. When DA still improves the forecast, i.e., the red dotted line is below the blue dotted line, the red solid line can be either above (1-h forecast for SO2) or below (10-h forecast for O3) the blue solid line, which means a better forecast from DA may either improve or degrade the MOS effect. Because in this work MOS corrects one day's forecast using the correction results and forecasts from previous days, it is not a surprise that Sim_DM and Sim_MOS show discrepancies when Sim_DA and Sim_base coincide after 24 h. However, as mentioned, such a discrepancy could be either an improvement or a degradation.
Figure 9. RMSE of the four experiments and five species at different forecast hours from -12 h to 72 h.
3.5. Discussion
This section provides additional discussion of two facts. The first is that MOS may improve forecasts far more than the 3DVar DA of CICs. This is reasonable because MOS can remain effective throughout the whole forecast period, whereas the effect of 3DVar via optimized initial conditions usually diminishes after 24 h of model integration. The loss of benefit from 3DVar DA is unavoidable because atmospheric chemistry is less sensitive to CICs than to other driving factors like meteorological conditions and emissions (Henze et al., 2009; Semane et al., 2009; Tang et al., 2011). Worse still, the forecast during the earliest 12 h, which benefits the most from 3DVar DA, is usually of no use in a real operational forecasting environment and is excluded from evaluation as spin-up. In fact, when compared with Sim_base, Sim_DA can account for 43.85% of the O3 RMSE reduction in the first 12 h after initialization (forecast hour -12 to -1), closely rivalling the contribution of Sim_MOS (55.94%) in its first 12 h (forecast hour 0 to 11). However, when discussed within the same forecast period, e.g., 12 h to 24 h, Sim_DA can produce only 3.93% of the O3 RMSE reduction, which is far less than Sim_MOS (61.26%), not to mention the following hours, during which DA has no effect at all.
The second fact to be explained is that Sim_DM does not always outperform Sim_MOS. This result runs somewhat counter to the expectation that, when corrected with the same MOS algorithm, better input should lead to better or at least not worse output. As a counterexample, assume that the difference between the forecast and observation remains constant before 3DVar is applied (this condition may never occur in reality, but that does not matter for the purposes of demonstration). Then, no matter how large the constant is, the MOS method might work perfectly to eliminate it. However, after DA is applied, it is possible that the difference becomes smaller but no longer remains constant.
In this case, although the forecast is better before MOS is applied, it is now more difficult for MOS to eliminate the error. It should therefore be noted that the temporal consistency of the error, rather than its magnitude, determines the effect of MOS on the model outputs. While reducing the magnitude of the error in the model outputs, the 3DVar DA process may at the same time weaken or strengthen its consistency, and thereby degrade or improve the effect of MOS from case to case. Such a phenomenon is therefore not tied to any inherent or necessary property of the model, DA and MOS processes, and will change randomly whenever the setup of any of the three changes. This assumption is supported by the evidence that the results are very different when the whole experiment is replicated with the spatial resolution of the original anthropogenic emissions changed from 0.1° × 0.1° to 0.25° × 0.25° (see Fig. 10; PM2.5 and PM10 are not plotted because the problem was not obvious for them in Fig. 9). For example, in Fig. 9, SO2 is predicted better by Sim_DA than Sim_base at the second forecast hour, but Sim_DM is beaten by Sim_MOS. However, in Fig. 10, the same comparison gives the opposite result: Sim_DM performs better than Sim_MOS. Given that the experiments cover a period of only one month, it is possible that the forecast ability of Sim_MOS would be slightly worse than, or almost the same as, that of Sim_DM if the experiment were carried out over a longer time. However, results from short-term experiments, which contain random error just like everyday forecasts, still demonstrate that using MOS and DA together does not guarantee a better output than MOS alone, which should be carefully considered by researchers and forecasters working in this field.
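The thought experiment above can be reproduced with a toy calculation. The sketch below is purely illustrative: the error series are invented, and the scalar Kalman filter with a random-walk state and fixed variances is an assumed stand-in for the MOS algorithm of this study. It compares a large constant error with a smaller error that flips sign every day, measuring the RMSE before and after a one-day-ahead bias correction (the first five days are skipped as filter spin-up).

```python
import math
from typing import List, Sequence


def one_step_bias_predictions(
    errors: Sequence[float],
    process_var: float = 1.0,   # assumed drift variance of the true bias
    obs_var: float = 4.0,       # assumed noise variance of each observed error
    init_var: float = 100.0,    # assumed initial uncertainty
) -> List[float]:
    """Bias predicted for each day from the previous days (random-walk state)."""
    x_est, p = 0.0, init_var
    predicted = []
    for y in errors:
        p_prior = p + process_var
        predicted.append(x_est)
        gain = p_prior / (p_prior + obs_var)
        x_est += gain * (y - x_est)
        p = (1.0 - gain) * p_prior
    return predicted


def raw_rmse(errors: Sequence[float], skip: int = 5) -> float:
    """RMSE of the uncorrected forecast errors after the spin-up days."""
    kept = errors[skip:]
    return math.sqrt(sum(e * e for e in kept) / len(kept))


def corrected_rmse(errors: Sequence[float], skip: int = 5) -> float:
    """RMSE left after subtracting the one-day-ahead predicted bias."""
    predicted = one_step_bias_predictions(errors)
    residuals = [e - b for e, b in zip(errors, predicted)][skip:]
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))


# Hypothetical error series (forecast minus observation), 30 days each
constant_error = [10.0] * 30                                          # large but consistent
alternating_error = [3.0 if d % 2 == 0 else -3.0 for d in range(30)]  # small but inconsistent

if __name__ == "__main__":
    for name, series in (("constant", constant_error), ("alternating", alternating_error)):
        print(name, round(raw_rmse(series), 2), round(corrected_rmse(series), 2))
```

In this toy setting the correction nearly removes the large constant error but leaves the smaller alternating error slightly worse than before, because the filter always lags one day behind the sign change. This matches the argument that consistency, not magnitude, governs what MOS can remove.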
Figure 10. RMSE of three species at different forecast hours from -12 h to 72 h from the same four experiments, with only the emissions perturbed.