
The Relationship between Deterministic and Ensemble Mean Forecast Errors Revealed by Global and Local Attractor Radii


Jie FENG1,
Jianping LI2,3,*,
Jing ZHANG4,
Deqiang LIU5,6,
Ruiqiang DING7,8

Corresponding author: Jianping LI, ljp@bnu.edu.cn
1.School of Meteorology, University of Oklahoma, Norman, OK 73072, USA
2.College of Global Change and Earth System Science (GCESS), Beijing Normal University, Beijing 100875, China
3.Laboratory for Regional Oceanography and Numerical Modeling, Qingdao National Laboratory for Marine Science and Technology, Qingdao 266237, China
4.Cooperative Institute for Research in the Atmosphere, GSD/ESRL/OAR/NOAA, Boulder, CO 80305, USA
5.Fujian Meteorological Observatory, Fuzhou 350001, China
6.Wuyishan National Park Meteorological Observatory, Wuyishan 354306, China
7.State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
8.Plateau Atmosphere and Environment Key Laboratory of Sichuan Province, Chengdu University of Information Technology, Chengdu 610225, China
Manuscript received: 2018-06-06
Manuscript revised: 2018-09-07
Manuscript accepted: 2018-10-10
Abstract: It has been demonstrated that ensemble mean forecasts, in the context of the sample mean, have higher forecasting skill than deterministic (or single) forecasts. However, few studies have focused on quantifying the relationship between their forecast errors, especially in individual prediction cases. The characteristics of deterministic and ensemble mean forecasts have also rarely been examined from the perspective of the attractors of dynamical systems. In this paper, two attractor statistics, namely the global and local attractor radii (GAR and LAR, respectively), are applied to reveal the relationship between deterministic and ensemble mean forecast errors. Practical forecast experiments are implemented in a perfect model scenario with the Lorenz96 model, and the numerical results are used for verification. The saturated sample mean errors of deterministic and ensemble mean forecasts can be expressed by the GAR and the attractor radius, respectively, and their ratio is found to approach $\sqrt{2}$ with lead time. Meanwhile, LAR can provide the expected ratio of the ensemble mean and deterministic forecast errors in individual cases.
Keywords: attractor radius,
ensemble forecasting,
ensemble mean,
forecast error saturation





1. Introduction
The atmospheric model, as a nonlinear chaotic system, is sensitive to initial and model-related errors. In other words, any arbitrarily small errors in the initial conditions will grow with time, finally causing the loss of most forecast information (Thompson, 1957; Lorenz, 1963, 1965). The deterministic (or single) forecast, the most widely used product in weather forecasting, can provide useful and informative predictions within a certain predictability limit, but is unable to give a quantitative estimate of forecast reliability. Ensemble forecasting, in contrast, is a feasible approach to supplying quantitative reliability information for forecasts in the form of probabilities (Leith, 1974; Toth and Kalnay, 1993, 1997; Molteni et al., 1996). Another general advantage of ensemble forecasting over deterministic forecasts is that forecast errors can be efficiently reduced by nonlinear filtering, i.e., by taking the arithmetic mean of an ensemble of forecasts (Leith, 1974; Szunyogh and Toth, 2002).
Over the past several decades, numerous studies have demonstrated the overall higher forecasting skill of the ensemble mean over that of deterministic forecasts by using numerical models with different degrees of complexity (Houtekamer and Derome, 1995; Toth and Kalnay, 1997; Buizza et al., 1999; Wang and Bishop, 2003; Wei et al., 2008; Zheng et al., 2009; Feng et al., 2014; Duan and Huo, 2016). However, quantitative estimations and comparisons of sample-mean deterministic and ensemble mean forecast errors are subject to sampling errors as a result of limited numbers of forecast samples and ensemble members, especially in complex numerical weather prediction models. It is even more challenging to compare the deterministic and ensemble mean forecasting skill for specific weather and climate events, due to the sampling uncertainties from the day-to-day variation of the underlying flow (Toth and Kalnay, 1997; Corazza et al., 2003).
The forecasts and the corresponding verifying references (generally the analysis states) are evolving states of the model and reference attractors, respectively. Therefore, forecast errors are essentially distances between states in attractor space. Li et al. (2018) proposed two statistics, namely the global and local attractor radii (GAR and LAR, respectively), that characterize the average distances between states on an attractor. GAR measures the average distance between two randomly selected states on an attractor, while LAR quantifies the average distance of all states on the attractor from a given state. For complex nonlinear dynamical systems, e.g., the atmosphere, GAR and LAR can be estimated simply from a long time series of observed states. Moreover, GAR is found to be a more accurate criterion for measuring the predictability limit than the traditional saturation value of the sample mean deterministic forecast errors; the latter, because of model errors, usually overestimates the actual error size that totally chaotic forecasts should have on average (Li and Ding, 2015; Li et al., 2018). In this study, GAR and LAR are further used to interpret the differences between deterministic and ensemble mean forecast errors in both sample-mean and single-case contexts without running numerical forecasts. This is expected to provide a reference for the verification and assessment of the skill of deterministic and ensemble mean forecasts.
The paper is organized as follows: Section 2 briefly introduces the definitions of GAR and LAR used in this study and the relevant theories. The experimental setup is presented in section 3. Section 4 displays and analyzes the roles of the attractor statistics in interpreting the relationship between deterministic and ensemble mean forecast errors. A discussion and conclusions are provided in section 5.

2. Definitions of GAR and LAR
The definitions of GAR and LAR are based on the premise that a compact attractor has an invariant probability density function and marginal density function. Following Li et al. (2018), consider ${x}=(x_1,x_2,\ldots,x_n)$ to be a state vector on a compact attractor $\mathcal{A}$. $R_{{\rm L},i}$ is the LAR of a given state ${x}_i$ on the attractor $\mathcal{A}$, defined by \begin{equation} \label{eq1} R_{{\rm L},i}=R_{\rm L}({x}_i)=\sqrt{E(\|{x}_i-{x}\|^2)},\quad {x}_i,{x}\in \mathcal{A} , \ \ (1)\end{equation} where $E(\cdot)$ represents the mathematical expectation and $\|\cdot\|$ is the L2 norm of a vector. Geometrically, LAR measures the average root-mean-square (RMS) distance of all states on the attractor from a given state. Assuming that $\mathcal{A}$ does not vary with time, $R_{{\rm L},i}$ is an invariant quantity for a specific state ${x}_i$. In particular, if ${x}_i$ is chosen to be the mean state ${x}_{\rm E}$ of $\mathcal{A}$, the corresponding LAR, $R_{\rm E}=\sqrt{E(\|{x}_{\rm E}-{x}\|^2)}$, is defined as the attractor radius. It has the same form as the standard deviation (SD) in statistics, which measures the variability of a variable.
Theorem 1: Let $d_i$ and $d_j$ denote the RMS distances of two states ${x}_i$ and ${x}_j$ on $\mathcal{A}$ from the mean state ${x}_{\rm E}$, and let $R_{{\rm L},i}$ and $R_{{\rm L},j}$ represent the LAR of ${x}_i$ and ${x}_j$, respectively. Then they satisfy the following relationship: \begin{equation} R_{{\rm L},i}>R_{{\rm L},j},\ {\rm if}\ d_{i}>d_{j} . \ \ (2)\end{equation}
This means that the minimum value of LAR is exactly the attractor radius. The proof of Theorem 1 is given in the Appendix.
The RMS of LARs over all states on $\mathcal{A}$ is defined as the GAR: \begin{equation} \label{eq2} R_{\rm G}=\sqrt{E(R_{\rm L}^2)}=\sqrt{E(\|{x}-{y}\|^2)} , \ \ (3)\end{equation} where x and y are two randomly selected state vectors from $\mathcal{A}$. GAR is an estimate of the average RMS distance between any two states on the same attractor space.
Theorem 2: The GAR $R_{\rm G}$ and the attractor radius $R_{\rm E}$ of a compact attractor $\mathcal{A}$ satisfy the constant proportional relationship \begin{equation} \label{eq3} R_{\rm G}=\sqrt{2}R_{\rm E} . \ \ (4)\end{equation}
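Although the full proof of Theorem 2 is not reproduced here, it can be sketched in one line from the identity $R_{{\rm L},i}^2=\|{x}_i-{x}_{\rm E}\|^2+R_{\rm E}^2$ derived in the Appendix: taking the expectation over all states ${x}$ on $\mathcal{A}$ in the definition of GAR gives \begin{eqnarray*} R_{\rm G}^2=E(R_{\rm L}^2)=E(\|{x}-{x}_{\rm E}\|^2)+R_{\rm E}^2=2R_{\rm E}^2 . \end{eqnarray*}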
The two statistics, GAR and LAR, and their relevant theorems will be applied to the quantitative estimation of deterministic and ensemble mean forecast errors.
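As an illustration (not part of the original study), the following minimal Python sketch estimates the attractor radius, LAR and GAR of a scalar variable from a long time series; the synthetic Gaussian series, with mean 2.22 and SD 3.63, is only a stand-in for a record of observed states.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-in for a long record of one variable on the attractor.
x = 2.22 + 3.63 * rng.standard_normal(200_000)

x_mean = x.mean()                            # mean state x_E
R_E = np.sqrt(np.mean((x - x_mean) ** 2))    # attractor radius (equals the SD)

def local_attractor_radius(xi):
    """Eq. (1): RMS distance of all states in the series from a given state xi."""
    return np.sqrt(np.mean((xi - x) ** 2))

# Eq. (3): RMS distance between randomly paired states on the same attractor.
i, j = rng.integers(0, x.size, (2, 500_000))
R_G = np.sqrt(np.mean((x[i] - x[j]) ** 2))

print(f"R_E = {R_E:.2f}, R_G = {R_G:.2f}, R_G/R_E = {R_G / R_E:.3f}  (Theorem 2: ~1.414)")
print(f"LAR at the mean state:   {local_attractor_radius(x_mean):.2f}  (minimum, equals R_E)")
print(f"LAR 2 SDs from the mean: {local_attractor_radius(x_mean + 2 * R_E):.2f}")
```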

3. Experimental setup
In our experiments, the simple Lorenz96 model (Lorenz, 1996) is used so that a large sample of forecast cases and ensemble members can be generated at low computational cost, which greatly reduces the sampling noise in the estimation of forecast errors. The Lorenz96 model is a 40-variable model and has been widely used to investigate theories and methods of ensemble prediction and data assimilation (e.g., Lorenz and Emanuel, 1998; Ott et al., 2004; Basnarkov and Kocarev, 2012; Feng et al., 2014). The model is given by \begin{equation} \label{eq4} dx_k/dt=(x_{k+1}-x_{k-2})x_{k-1}-x_k+F , \ \ (5)\end{equation} where $x_k$ ($k=1,2,\ldots,40$) represents the state variables and F is a forcing constant, with the cyclic boundary conditions $x_{-1}=x_{39}$, $x_0=x_{40}$ and $x_{41}=x_1$. For F=8 the model behaves chaotically. The equations are integrated with a fourth-order Runge-Kutta scheme using a time step of 0.05 time units (tu).
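For reference, a minimal Python implementation of the Lorenz96 model with the fourth-order Runge-Kutta scheme might look as follows; the spin-up and run lengths are illustrative only and not the exact settings of the study.

```python
import numpy as np

F = 8.0     # forcing constant; F = 8 gives chaotic behavior
DT = 0.05   # time step in time units (tu)

def lorenz96_tendency(x, forcing=F):
    """dx_k/dt = (x_{k+1} - x_{k-2}) x_{k-1} - x_k + F with cyclic indices."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + forcing

def rk4_step(x, dt=DT):
    """One fourth-order Runge-Kutta step of the Lorenz96 model."""
    k1 = lorenz96_tendency(x)
    k2 = lorenz96_tendency(x + 0.5 * dt * k1)
    k3 = lorenz96_tendency(x + 0.5 * dt * k2)
    k4 = lorenz96_tendency(x + dt * k3)
    return x + dt * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0

def integrate(x0, n_steps):
    """Integrate n_steps steps and return the trajectory of shape (n_steps+1, 40)."""
    traj = np.empty((n_steps + 1, x0.size))
    traj[0] = x0
    for n in range(n_steps):
        traj[n + 1] = rk4_step(traj[n])
    return traj

# Example: spin up from a slightly perturbed F-state, then save a short trajectory.
x0 = np.full(40, F)
x0[19] += 0.01                        # break the symmetry to trigger chaos
spun_up = integrate(x0, 20_000)[-1]   # 1000 tu spin-up at dt = 0.05
truth = integrate(spun_up, 2_000)     # a short sample trajectory of "true" states
```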
After an initial spin-up stage of 1000 tu, the model is run freely for a sufficiently long time ($10^4$ tu, i.e., $2\times10^5$ time steps) to generate the true states used as references for the forecasts. There are a total of $2\times10^5$ forecast cases, each initiated from one of these true states. If the initial true state is denoted by ${x}_{\rm t}$, the initial analysis state ${x}_{\rm a}$ is given by superposing an analysis error ${\delta}$ on ${x}_{\rm t}$: \begin{equation} {x}_{\rm a}={x}_{\rm t}+{\delta} . \ \ (6)\end{equation}
For simplicity, each element of the analysis error ${\delta}$ is drawn from a Gaussian distribution with mean 0 and SD 1, and the RMS amplitude of ${\delta}$ is then rescaled to 0.1, which is about 3% of the climatic SD of ${x}_{\rm t}$ (3.63). Each ensemble perturbation is generated in the same way as the analysis error but from a different realization of noise, and its RMS amplitude is likewise rescaled to 0.1. A total of $2.5\times10^5$ ensemble perturbations are produced in each case and are added to and subtracted from the analysis ${x}_{\rm a}$ to generate $N=5\times10^5$ initial ensemble members ($2.5\times10^5$ pairs), so that their mean remains equal to ${x}_{\rm a}$. The deterministic and ensemble forecasts in each case are obtained by integrating the analysis state and the initial ensemble members for 10 tu with the same model that generates the truth (i.e., a perfect model scenario).
Although the analyses in our study are not generated through commonly used data assimilation approaches, the initial ensemble perturbations have the same probability distribution as the analysis errors and are thus expected to sample the analysis errors optimally. Moreover, the ensemble size ($5\times10^5$) is much larger than the model dimension (40). These two design choices eliminate the possible effects of suboptimal initial ensemble perturbations and of a limited ensemble size on the ensemble mean skill.
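A sketch of this initial-condition setup is given below, with a far smaller ensemble than the $2.5\times10^5$ pairs used in the study; `x_true` stands for one 40-variable state of the nature run, and the helper names are ours, not the study's.

```python
import numpy as np

rng = np.random.default_rng(1)
TARGET_RMS = 0.1    # RMS amplitude of analysis errors and initial perturbations
N_PAIRS = 50        # the study uses 2.5e5 pairs; kept small in this sketch

def rescale_to_rms(v, rms=TARGET_RMS):
    """Rescale a perturbation so that its RMS amplitude equals `rms`."""
    return v * rms / np.sqrt(np.mean(v ** 2))

def make_analysis(x_true):
    """Eq. (6): superpose a Gaussian analysis error of fixed RMS size on the truth."""
    return x_true + rescale_to_rms(rng.standard_normal(x_true.size))

def make_ensemble(x_analysis, n_pairs=N_PAIRS):
    """Paired (+/-) perturbations keep the ensemble mean exactly at the analysis."""
    perts = np.array([rescale_to_rms(rng.standard_normal(x_analysis.size))
                      for _ in range(n_pairs)])
    return np.concatenate([x_analysis + perts, x_analysis - perts])

# Usage with a 40-variable true state (e.g., one state of the nature run):
x_true = 8.0 + rng.standard_normal(40)
x_a = make_analysis(x_true)
members = make_ensemble(x_a)
assert np.allclose(members.mean(axis=0), x_a)   # ensemble centred on the analysis
```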

4. Results
4.1. GAR and LAR of the Lorenz96 model
Owing to the ergodicity of attractors, the evolving states of chaotic dynamical systems visit different regions of the attractor with stable probabilities in the long run (Farmer et al., 1983; Eckmann and Ruelle, 1985; Zou et al., 1985; Li and Chou, 1997). We first examine the probability distribution on the attractor of the Lorenz96 system to display the long-term behavior of the system. Figure 1 shows the probability distribution of variable $x_1$ (the choice of variable has no effect on the results because of the homogeneity of the Lorenz96 model). The probability distribution of $x_1$ clearly becomes invariant once the time series is sufficiently long ($2.5\times10^6$ tu here). The mean value and the SD of the attractor are 2.22 and 3.63, respectively.
Figure 1. Probability (%) distribution of variable $x_1$ of the Lorenz96 model over the attractor set for time series of different lengths ($5\times10^2$, $2.5\times10^3$, $5\times10^3$, $2.5\times10^4$, $5\times10^4$, $5\times10^5$ and $2.5\times10^6$ tu, in order).


Figure 2 shows the variation of LAR (red solid line) as a function of the value of $x_1$, calculated from a $2.5\times10^6$ tu time series. The probability distribution of the system (black solid line) is also shown as a reference. LAR clearly depends on the specific state on the attractor: states farther from the mean state occur with smaller probability and have larger LAR, in accordance with Theorem 1. When $x_1$ approaches the mean state, LAR reaches its minimum value, namely the attractor radius $R_{\rm E}$, which equals the SD (3.63). In addition, the $R_{\rm G}$ of variable $x_1$ (5.13), calculated as the RMS of $R_{\rm L}$ over all states on the attractor, is exactly $\sqrt{2}$ times $R_{\rm E}$, as stated by Theorem 2.
Figure 2. Variation of LAR (red solid line) as a function of the value of $x_1$, together with the probability distribution of variable $x_1$ (black solid line). The $x_1$ value with the lowest LAR (red dashed line) coincides with the mean state (black dashed line), 2.22.



4.2. Evolution of ensemble mean and deterministic forecast states
The differences between deterministic and ensemble mean forecast errors are essentially associated with their differing forecast states. Therefore, the statistical characteristics of the deterministic and ensemble mean forecast states are analyzed before their forecast errors are compared. Each panel of Fig. 3 illustrates the probability distribution of the deterministic (black line) and ensemble mean (blue line) forecast states over all cases at the same lead time. The probability distribution of the deterministic forecasts is always consistent with that of the reference (red line) from 0.5 to 6 tu, since they belong to the same attractor. In contrast, the probability distribution of the ensemble mean states develops a narrower range and a higher peak as time increases. In other words, the ensemble mean forecasts tend, on the whole, to move toward the climatic mean value (2.22) with lead time because of the nonlinear smoothing effect of the arithmetic mean of the forecast ensemble (Toth and Kalnay, 1997). Finally, when all forecast members become chaotic at sufficiently long lead times, their ensemble mean would, without exception, equal the climatic mean in any individual case. This indicates that the ensemble mean reduces the forecast error compared with deterministic forecasts, but at the expense of losing information and variability in the forecasts. On the other hand, according to these characteristics of the forecast states, the saturation value of the sample mean ensemble mean forecast errors is expected to be consistent with the attractor radius, while the deterministic forecast errors should saturate at the level of GAR. This conclusion is verified by the forecast experiments in section 4.3.
Figure 3. Probability (%) distribution of the ensemble mean (blue line; right-hand scale) and deterministic (black line; left-hand scale) forecast states and true states (red line; left-hand scale) over all $2\times10^5$ samples as a function of lead time.



4.3. Sample mean forecast errors
Figure 4 shows the RMS error of $x_1$ for the deterministic and ensemble mean forecasts as a function of lead time, averaged over all cases. Within the initial 1 tu, the deterministic and ensemble mean forecasts have similar errors because the approximately linear growth of the paired positive and negative initial perturbations cancels in the ensemble mean. After 1 tu, the ensemble mean forecasts retain smaller errors than the deterministic ones, and the difference increases continuously with lead time. Finally, the deterministic and ensemble mean forecast errors both enter the nonlinear saturation stage and reach 5.13 and 3.63, respectively; the former equals the GAR and the latter the attractor radius. The ratio of the saturation values is $\sqrt{2}$, as anticipated from Theorem 2 and the discussion in section 4.2. This is also consistent with the conclusions of Leith (1974) and Kalnay (2003).
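For completeness, the sample mean error curves of Fig. 4 can be approximated with a sketch like the one below, which reuses the hypothetical `integrate`, `rk4_step`, `make_analysis` and `make_ensemble` helpers (and `spun_up` state) from the sketches in section 3 and deliberately uses far fewer cases and members than the actual experiments.

```python
import numpy as np

rng = np.random.default_rng(2)
N_CASES, N_STEPS = 20, 200           # 200 steps x 0.05 tu = 10 tu of forecast range

nature = integrate(spun_up, 20_000)  # a long nature run providing the true states
starts = rng.integers(0, nature.shape[0] - N_STEPS, N_CASES)

mse_det = np.zeros(N_STEPS + 1)
mse_ens = np.zeros(N_STEPS + 1)
for s in starts:
    x_det = make_analysis(nature[s])     # deterministic forecast starts from the analysis
    members = make_ensemble(x_det)       # ensemble centred on the same analysis
    for n in range(N_STEPS + 1):
        truth_n = nature[s + n]
        mse_det[n] += np.mean((x_det - truth_n) ** 2)
        mse_ens[n] += np.mean((members.mean(axis=0) - truth_n) ** 2)
        x_det = rk4_step(x_det)
        members = np.array([rk4_step(m) for m in members])

rmse_det = np.sqrt(mse_det / N_CASES)   # expected to saturate near the GAR (~5.13)
rmse_ens = np.sqrt(mse_ens / N_CASES)   # expected to saturate near R_E (~3.63)
```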
Figure 4. RMS error averaged over $2\times10^5$ samples for the deterministic (black solid line) and ensemble mean (red solid line) forecasts as a function of time. The dashed lines are the saturation values, 5.13 and 3.63, of the deterministic and ensemble mean forecast errors, respectively.



4.4. Forecast errors in individual cases
In comparison with sample mean forecasts, the forecasts of a specific weather or climate event are strongly influenced by the evolving dynamics (Ziehmann et al., 2000; Corazza et al., 2003), and it is thus difficult to estimate the expected values of both deterministic and ensemble mean forecast errors. LAR is a feasible statistic for estimating the expected deterministic and ensemble mean forecast errors in individual cases without running practical forecasts. As the nonlinearity in the forecasts intensifies, the ensemble mean approaches the mean state (see Fig. 3), while the deterministic forecast tends to be a random state on the attractor; their expected errors for a true state ${x}_i$ therefore approach $\|{x}_i-{x}_{\rm E}\|$ and $R_{{\rm L},i}$, respectively. Referring to the definition of LAR in Eq. (1), the ratio r of the expected ensemble mean to deterministic forecast errors for a specific true state ${x}_i$ can thus be expressed as \begin{equation} r=\frac{\|{x}_i-{x}_{\rm E}\|}{R_{{\rm L},i}}=\frac{\|{x}_i-{x}_{\rm E}\|}{\sqrt{\|{x}_i-{x}_{\rm E}\|^2+R_{\rm E}^2}} . \ \ (7)\end{equation}
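As a quick numerical illustration of Eq. (7) (a sketch, not from the paper; the mean 2.22 and attractor radius 3.63 of $x_1$ are taken from section 4.1):

```python
import numpy as np

X_MEAN, R_E = 2.22, 3.63    # mean state and attractor radius of x1 (section 4.1)

def expected_error_ratio(x_obs):
    """Eq. (7): expected ensemble mean / deterministic error ratio at long lead
    times for a case whose verifying true value of x1 is x_obs."""
    d = abs(x_obs - X_MEAN)
    return d / np.sqrt(d ** 2 + R_E ** 2)

for k in (0.5, 1.0, 2.0, 3.0):       # events k SDs away from the mean state
    print(f"{k:.1f} SD: r = {expected_error_ratio(X_MEAN + k * R_E):.2f}")
# 1 SD -> ~0.71 and 2 SD -> ~0.89, matching the range quoted for Fig. 5
```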
Figure 5 shows the variation of r as a function of the true state $x_1$. The ensemble mean has its maximum advantage over the deterministic forecast when the truth (or the observed state) is close to the climatic mean state. As the truth deviates from the mean state, the superiority of the ensemble mean over the deterministic forecast diminishes rapidly. For an event between 1 and 2 SDs, r ranges approximately from 0.7 to 0.9. Once the event lies beyond 2 SDs, r is close to 0.95 or higher, which means that the ensemble mean and deterministic forecasts perform very similarly. This indicates that the ensemble mean has little advantage over deterministic forecasts in predicting the variability of extreme events, and that the overall better performance of the former (see Fig. 4) originates from its higher skill for neutral events. Given a long time series of a variable, the distribution of r can be estimated in advance and used as a reference for the deterministic and ensemble mean forecast skill in individual cases, especially for long-range forecasts.
Figure 5. Ratio between the expected values of the ensemble mean (e_EM) and deterministic (e_Det) forecast errors of $x_1$ as a function of the observed value of $x_1$. The red dashed line indicates the mean state, and the black dashed lines mark 1 and 2 SDs from the mean.


To verify the above result, the practical errors of the deterministic and ensemble mean forecasts are compared. The forecast skill is assessed with the true states divided into three categories: neutral (within 1 climatic SD), weak extreme (between 1 and 2 SDs) and strong extreme (beyond 2 SDs) events. Figure 6 compares the deterministic and ensemble mean forecast errors for the three groups of events at lead times of 1, 2, 3 and 4 tu. At 1 tu the deterministic and ensemble mean forecast errors lie within similar ranges; at later times, the range of the ensemble mean errors, owing to the nonlinear filtering, is evidently smaller than that of the deterministic forecast errors. After 1 tu, for both the deterministic and the ensemble mean forecasts, the forecast errors of extreme events are overall larger than those of neutral events at the same lead time, as shown in Table 1, which is essentially related to the distribution of LAR on an attractor. At a long lead time (4 tu), the ratios between the average ensemble mean and deterministic forecast errors are 0.54 (1.69 vs 3.11), 0.87 (4.19 vs 4.79) and 0.99 (7.23 vs 7.33) for neutral, weak extreme and strong extreme events, respectively, which lie within the range of the expected ratio in Fig. 5. At shorter lead times, the errors of the deterministic and ensemble mean forecasts become closer for neutral and weak extreme events, but the ensemble mean performs much worse (about a 20% error increase at 1 and 2 tu) for strong extreme events. For more extreme events at a given lead time, the ensemble mean forecasts are less likely to have small RMS errors, especially at longer lead times (see Figs. 6c, f, i and l).
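The event classification and category-mean error ratios used here could be computed with a small helper of the following form (a sketch with hypothetical inputs: 1-D arrays of verifying $x_1$ values and of per-case ensemble mean and deterministic errors at a fixed lead time).

```python
import numpy as np

X_MEAN, SD = 2.22, 3.63
LABELS = ["neutral", "weak extreme", "strong extreme"]   # <1 SD, 1-2 SD, >2 SD

def error_ratio_by_category(x_true, err_ens, err_det):
    """Group cases by how far the verifying truth lies from the mean (in SDs)
    and return the mean ensemble-mean/deterministic error ratio per category."""
    x_true = np.asarray(x_true)
    err_ens = np.asarray(err_ens)
    err_det = np.asarray(err_det)
    d = np.abs(x_true - X_MEAN) / SD
    cat = np.digitize(d, [1.0, 2.0])          # 0, 1, 2 for the three categories
    return {LABELS[c]: float(np.mean(err_ens[cat == c]) / np.mean(err_det[cat == c]))
            for c in range(3) if np.any(cat == c)}
```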
Figure 6. RMS error of the ensemble mean and the deterministic forecasts for neutral events within 1 SD at (a) 1 tu, (d) 2 tu, (g) 3 tu and (j) 4 tu. The second and third columns are the same as the first, but for weak extreme states between 1 and 2 SDs and strong extreme states beyond 2 SDs, respectively.



5. Discussion and conclusions
In this study, we investigate the quantitative relationship between the errors of deterministic and ensemble mean forecasts using the Lorenz96 model as an example. Instead of evaluating the results from a large number of forecast samples, as most studies do, the skill of deterministic and ensemble mean forecasts is compared using two statistics defined on attractors, namely the global and local attractor radii (GAR and LAR, respectively). GAR and LAR quantitatively describe the average distances among states on the same attractor, which are found to be closely related to forecast errors. The saturated sample mean errors of deterministic and ensemble mean forecasts with a perfect model can be approximately estimated by the GAR and the attractor radius, respectively, and their ratio equals $\sqrt{2}$. Moreover, the expected ratio between the ensemble mean and deterministic forecast errors in individual cases can be quantified by the LAR-related statistics. The results indicate that the superiority of the ensemble mean over deterministic forecasts decreases significantly from neutral to strong extreme events.
GAR and LAR can be applied to practical weather and climate predictions. Since GAR and LAR are independent of specific forecast models and are derived from the attractor of observed states, they can provide objective and accurate criteria for quantifying the predictability of sample mean forecasts and of individual cases in operations, respectively. The deviations of GAR and LAR between observed and model states may indicate the level of model deficiencies and provide guidance for improving model performance.
The relative performance of deterministic and ensemble mean forecasts revealed by GAR and LAR will not change for practical weather and climate forecasts with model errors. However, GAR and LAR calculated from the observed states may introduce bias when used to estimate the expected errors of deterministic and ensemble mean forecasts in imperfect prediction models. It may be more appropriate to use the other two attractor statistics introduced by Li et al. (2018), namely the global and local average distances (GAD and LAD, respectively), which are similar to GAR and LAR but estimate the average distance between states on two different attractors. The application of GAD and LAD to practical deterministic and ensemble mean forecasts will be studied further in the future.
Since neutral events occur with large probability, the ensemble mean can still provide a valuable reference most of the time. However, the filtering effect of the ensemble mean results in an inherent disadvantage in predicting extreme events, which cannot easily be overcome. In operations, each ensemble forecast usually has not only amplitude errors but also positional errors when predicting specific flow patterns, e.g., a trough. Therefore, the ensemble mean may have stronger smoothing effects than in our theoretical results with a simple model, and thus be even less capable of capturing extreme flow features. To identify extreme weather, the performance of deterministic forecast models needs further improvement toward higher spatial resolution and more accurate model physics and parameterizations. Additionally, more efficient post-processing methods for ensemble forecast members need to be developed to extract more accurate probabilistic forecast information.

APPENDIX
This appendix gives the proof of Theorem 1. $R_{{\rm L},i}$ and $R_{{\rm L},j}$ are the local attractor radii of the compact attractor $\mathcal{A}$ at states ${x}_i$ and ${x}_j$, respectively, and ${x}_{\rm E}$ and $R_{\rm E}$ are the mean state and attractor radius of $\mathcal{A}$. Based on Eq. (1), $R_{{\rm L},i}$ can be expanded as follows: \begin{eqnarray*} R_{{\rm L},i}^2&=&E(\|{x}_i-{x}\|^2) ,\quad {x}_i,{x}\in \mathcal{A} ,\\ &=&E(\|{x}\|^2)-2{x}_i^{\rm T}E({x})+\|{x}_i\|^2 ,\\ &=&\|{x}_i\|^2-2{x}_{\rm E}^{\rm T}{x}_i+(\|{x}_{\rm E}\|^2+R_{\rm E}^2) ,\\ &=&\|{x}_i-{x}_{\rm E}\|^2+R_{\rm E}^2 , \end{eqnarray*} where $E(\|{x}\|^2)=\|{x}_{\rm E}\|^2+R_{\rm E}^2$ follows from the definition of the attractor radius.
Hence $R_{{\rm L},i}$ reaches its minimum value $R_{\rm E}$, i.e., the attractor radius, when ${x}_i={x}_{\rm E}$; and if $d_i>d_j$, then $R_{{\rm L},i}>R_{{\rm L},j}$, where $d_i$ and $d_j$ denote the RMS distances of ${x}_i$ and ${x}_j$ from the mean state ${x}_{\rm E}$.
