
Predictability of Ensemble Forecasting Estimated Using the Kullback-Leibler Divergence in the Lorenz


Ruiqiang DING1,2,*,
Baojia LIU3,
Bin GU4,
Jianping LI5,
Xuan LI1,2

Corresponding author: Ruiqiang DING,drq@mail.iap.ac.cn;
1.State Key Laboratory of Numerical Modeling for Atmospheric Sciences and Geophysical Fluid Dynamics, Institute of Atmospheric Physics, Chinese Academy of Sciences, Beijing 100029, China
2.College of Earth Sciences, University of Chinese Academy of Sciences, Beijing 100049, China
3.Institute of Space Weather, School of Math and Statistics, Nanjing University of Information Science and Technology, Nanjing 210044, China
4.College of Physics and Optoelectronic Engineering, Nanjing University of Information Science and Technology, Nanjing 210044, China
5.College of Global Change and Earth System Sciences, Beijing Normal University, Beijing 100875, China
Manuscript received: 2019-02-26
Manuscript revised: 2019-04-18
Manuscript accepted: 2019-04-30
Abstract:A new method to quantify the predictability limit of ensemble forecasting is presented using the Kullback-Leibler (KL) divergence (also called the relative entropy), which provides a measure of the difference between the probability distributions of ensemble forecasts and local reference (true) states. The KL divergence is applicable to a non-normal distribution of ensemble forecasts, which is a substantial improvement over the previous method using the ensemble spread. An example from the three-variable Lorenz model illustrates the effectiveness of the KL divergence, which can effectively quantify the predictability limit of ensemble forecasting. On this basis, the KL divergence is used to investigate the dependence of the predictability limit of ensemble forecasting on the initial states and the magnitude of initial errors. The local predictability limit of ensemble forecasting varies considerably with the initial states, as well as with the magnitude of initial errors. Further research is needed to examine the real-world applications of the KL divergence in measuring the predictability of ensemble weather forecasts.
Keywords: predictability,
ensemble forecasting,
Kullback-Leibler divergence
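Abstract (translated from the Chinese): The Kullback-Leibler (KL) divergence (relative entropy) quantitatively characterizes the difference between the probability distributions of ensemble forecasts and local reference states. This paper proposes a new method that uses the KL divergence to quantitatively estimate the predictability limit of ensemble forecasting. The KL divergence method applies not only to ensemble forecasts whose probability distribution is normal but also to those whose distribution is non-normal, and therefore has wider applicability than traditional methods. Applying the method to the Lorenz model shows that it can effectively quantify the predictability limit of ensemble forecasting. On this basis, the KL divergence method is used to study the dependence of the predictability limit of ensemble forecasting on the initial state and the magnitude of the initial error. The results show that the local predictability limit of ensemble forecasting varies considerably with the initial state and with the magnitude of the initial error. In future work, we will apply the KL divergence method to quantitatively estimate the predictability of real-world ensemble weather forecasts.
Keywords (translated from the Chinese): predictability,
ensemble forecasting,
KL divergence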





1. Introduction
The atmosphere is a chaotic system in which small errors in its initial state can lead to large forecast errors (Thompson, 1957; Lorenz, 1963, 1965; Chou, 1989; Li and Chou, 1997; Bengtsson and Hodges, 2006). We can never observe every detail of the atmosphere's initial state, either in terms of spatial coverage or accuracy of measurements, so the initial conditions from which every forecast starts are inevitably slightly inaccurate. Small errors in the initial state will be amplified, so there is always a limit to how far ahead we can predict weather events (Lorenz, 1969, 1996; Dalcher and Kalnay, 1987; Li and Ding, 2011). Considering that weather predictions are inherently uncertain, the concept of ensemble forecasting was proposed to provide probabilistic forecasts of the future state of the atmosphere (Epstein, 1969; Leith, 1974). The basic idea of ensemble forecasting is to produce not just one single forecast but an ensemble of many forecasts starting from slightly different initial conditions.
In contrast to a single forecast, the ensemble mean of forecasts acts as a nonlinear filter that reduces forecast error (Toth and Kalnay, 1993). In general, the ensemble mean of forecasts will, on average, have a smaller error than any of the single forecasts making up the ensemble (Leith, 1974; Murphy, 1988). Most importantly, the spread between the ensemble members (also called the forecast variance), which is an estimate of the standard deviation of ensemble members with respect to the ensemble mean, provides key information on the degree of confidence in the predictions under the assumption that the outputs of the ensemble members follow a normal distribution (Barker, 1991; Buizza, 1997; Palmer et al., 1998; Zhu et al., 2002). A large (small) ensemble spread indicates more (less) uncertainty in the prediction in general. In view of its advantages, ensemble forecasting is commonly performed at most of the major operational weather prediction centers worldwide, including the National Centers for Environmental Prediction (Toth and Kalnay, 1993, 1997; Wei et al., 2006, 2008), the European Centre for Medium-Range Weather Forecasts (Molteni et al., 1996; Buizza, 1997), and the Canadian Meteorological Centre (Houtekamer et al., 1996).
Ensemble forecasting aims to provide an approximate description of the probability distribution of possible future states of the atmosphere. The probability information is typically derived by using a finite number of ensemble members. Assuming that the forecast probability distribution is normal or unimodal, the width of the distribution from forecast to forecast can be measured by the ensemble spread or variance. However, the forecast probability distribution is not always unimodal and can sometimes be bimodal or even multimodal. In this case, the ensemble spread may fail to reflect the ensemble mean skill or predictability of ensemble forecasting. As pointed out by Whitaker and Loughe (1998), even for a perfect ensemble the correlation between the ensemble spread and skill may be very low. In addition, the ensemble spread has limited utility as a predictor of ensemble mean skill (Houtekamer, 1993; Kumar et al., 2000; Grimit and Mass, 2002; Tang et al., 2008a).
Given that ensembles provide flow-dependent probabilistic forecasts of the future state of the atmosphere, it is more appropriate to investigate the predictability of ensemble forecasting from the standpoint of the flow-dependent probability distribution of ensemble forecasts instead of the ensemble spread. In the present study, in relation to the forecast probability distribution, we introduce the Kullback-Leibler (KL) divergence (also called the relative entropy) to measure the predictability limit of ensemble forecasting. The KL divergence is a measure of how one probability distribution diverges from a second, expected probability distribution (Kullback and Leibler, 1951), thereby enabling an estimate of the difference between the probability distributions of ensemble forecasts and local reference (true) states. By investigating the evolution of the KL divergence with time, we can quantitatively estimate the predictability limit of ensemble forecasting. In contrast to the ensemble spread, the KL divergence not only provides a quantitative measure of the predictability limit of ensemble forecasting but is applicable to a non-normal distribution of ensemble forecasts, thereby overcoming the limitations of the ensemble spread and providing an effective way to investigate the predictability of ensemble forecasting.
Note that information theory measures, such as the KL divergence or relative entropy, have been used in previous studies to measure the skill of ensemble forecasts (Stephenson and Doblas-Reyes, 2000; Roulston and Smith, 2002; DelSole, 2004, 2005; Tang et al., 2005, 2008b). However, in these studies the entropy of ensemble forecasts was used as a measure or predictor of forecast skill, rather than a measure of the predictability limit. In this paper, we present a wider role of information theory in quantifying the predictability limit of ensemble forecasting, which can provide useful information on the time at which ensemble forecasts become meaningless.
The remainder of this paper is organized as follows. Section 2 provides a definition of the KL divergence and presents a method to compute the KL divergence for ensemble forecasting. Section 3 tests the validity and usefulness of the KL divergence in measuring the predictability of ensemble forecasting by applying it to a simple system, the three-variable Lorenz model. Section 4 summarizes the major results of this work and discusses possible limitations and future research.

2. Methods
2.1. KL divergence
The KL divergence measures the difference between two probability distributions P and Q (Kullback and Leibler, 1951). For discrete probability distributions P and Q, the KL divergence from Q to P is defined as \begin{equation} \label{eq1} D_{\rm KL}(P\|Q)=\sum_i{P(i)\log\frac{P(i)}{Q(i)}} , \ \ (1)\end{equation} where "$\|$" denotes "relative to", and Eq. (1) is equivalent to \begin{equation} \label{eq2} D_{\rm KL}(P\|Q)=-\sum_i {P(i)\log\frac{Q(i)}{P(i)}} . \ \ (2)\end{equation}
For distributions P and Q of a continuous random variable x, the KL divergence is defined as \begin{equation} \label{eq3} D_{\rm KL}(P\|Q)=\int_{-\infty}^\infty p(x)\log\frac{p(x)}{q(x)}dx , \ \ (3)\end{equation} where p and q represent the probability densities of P and Q. The KL divergence is always non-negative, with $D_{\rm KL}(P\|Q)=0$ if and only if P=Q.
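As an illustration of Eq. (1), the following minimal sketch evaluates the discrete KL divergence for two hand-picked two-bin distributions. The function name `kl_divergence` and the example distributions are assumptions made here for illustration and do not come from the paper; the treatment of zero-probability bins follows the usual convention (0 log 0 = 0, and an infinite divergence when Q(i)=0 but P(i)>0).

```python
import math


def kl_divergence(p, q):
    """Discrete D_KL(P||Q) = sum_i P(i) * log(P(i) / Q(i)), natural log.

    Bins with P(i) == 0 contribute nothing; a bin with P(i) > 0 and
    Q(i) == 0 makes the divergence infinite.
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same number of bins")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0.0:
            continue
        if qi == 0.0:
            return math.inf
        total += pi * math.log(pi / qi)
    return total


# Hypothetical two-bin distributions, chosen only to exercise the function.
p = [0.5, 0.5]
q = [0.9, 0.1]
d_pq = kl_divergence(p, q)  # divergence of P relative to Q
d_qp = kl_divergence(q, p)  # generally differs: the measure is not symmetric
d_pp = kl_divergence(p, p)  # identical distributions
```

The example also shows that the KL divergence is not symmetric in its arguments, so the choice of which distribution plays the role of P matters.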

2.2. Local attractor radius
Let xi be a specific state on a compact attractor Ω; then the local attractor radius (LAR, RL) with respect to the state xi is defined by Li et al. (2018) as \begin{equation} \label{eq4} R_L({x}_i)=\sqrt{E(\|{x}_i-{x}\|^2)} ,\quad {x}_i,{x}\in\Omega , \ \ (4)\end{equation}
where the norm $\|\cdot\|$ represents the L2-norm and E denotes the expectation. The LAR measures the root-mean-square distance between one specific state xi and all other states on an attractor. In terms of the LAR, the local attractor with respect to the state xi can be defined as a subset of all states on the attractor whose distance to the state xi is less than the LAR. Li et al. (2018) showed that the LAR can be used as an objective metric to quantify the local predictability limit of forecast models. In the present study, the LAR is used to define the local attractor with respect to a specific reference state and to construct the probability distributions of local reference (true) states.
An example from the three-variable Lorenz system is given to illustrate the spatial structure of the LAR over the Lorenz attractor. The three-variable Lorenz system is \begin{equation} \label{eq5} \left\{ \begin{array}{l} \dfrac{{\rm d}X}{{\rm d}t}=-\sigma X+\sigma Y \\ \dfrac{{\rm d}Y}{{\rm d}t}=rX-Y-XZ\\ \dfrac{{\rm d}Z}{{\rm d}t}=XY-bZ \end{array} \right., \ \ (5)\end{equation} where σ=10, r=28, and b=8/3, for which the system exhibits chaotic behavior (Lorenz, 1963). Figure 1 shows a projection of the LAR over the Lorenz attractor in the x-y plane. The LAR varies widely over the attractor, with a minimum value of ~15 and a maximum value exceeding 35. The LAR is not randomly distributed but exhibits a distinct organization in phase space, consistent with the results of Li et al. (2018). The LAR is antisymmetric with respect to the x- or y-axis, with minimum values at the intersection of the two wings and maximum values at the outermost rims. As the LAR varies over the attractor, the local attractor with respect to a specific state also changes with the state.
Figure 1. Projection of the LAR over the Lorenz attractor in the x-y plane.
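The LAR of Eq. (4) and the local attractor built from it can be estimated from a long model trajectory. The sketch below is one possible way to do this, not the authors' code: it integrates Eq. (5) with a fourth-order Runge-Kutta scheme and replaces the expectation in Eq. (4) with an average over sampled states. The step size, sampling stride, spin-up length, starting point, and all function names are assumptions chosen for illustration.

```python
import math

SIGMA, R, B = 10.0, 28.0, 8.0 / 3.0  # parameter values given with Eq. (5)


def lorenz(s):
    """Right-hand side of the three-variable Lorenz system, Eq. (5)."""
    x, y, z = s
    return (SIGMA * (y - x), R * x - y - x * z, x * y - B * z)


def rk4_step(s, dt):
    """Advance the state s by one fourth-order Runge-Kutta step."""
    def shifted(base, slope, h):
        return tuple(b + h * k for b, k in zip(base, slope))

    k1 = lorenz(s)
    k2 = lorenz(shifted(s, k1, dt / 2))
    k3 = lorenz(shifted(s, k2, dt / 2))
    k4 = lorenz(shifted(s, k3, dt))
    return tuple(si + dt / 6 * (a + 2 * b + 2 * c + d)
                 for si, a, b, c, d in zip(s, k1, k2, k3, k4))


def sample_attractor(n_states, dt=0.01, stride=10, spinup=5000,
                     start=(1.0, 1.0, 20.0)):
    """Discard a spin-up, then keep every `stride`-th state (assumed settings)."""
    s = start
    for _ in range(spinup):
        s = rk4_step(s, dt)
    states = []
    for _ in range(n_states):
        for _ in range(stride):
            s = rk4_step(s, dt)
        states.append(s)
    return states


def distance(a, b):
    """L2 distance between two states."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def local_attractor_radius(xi, states):
    """Sample estimate of Eq. (4): RMS distance from xi to the sampled states."""
    return math.sqrt(sum(distance(xi, x) ** 2 for x in states) / len(states))


def local_attractor(xi, states):
    """Sampled states whose distance to xi is less than the LAR of xi."""
    radius = local_attractor_radius(xi, states)
    return [x for x in states if distance(xi, x) < radius]


states = sample_attractor(2000)
lar = [local_attractor_radius(x, states) for x in states[:200]]
subset = local_attractor(states[0], states)
```

With a longer trajectory the sample average approaches the expectation more closely; the sample size used here is kept small only so that the sketch runs quickly.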



2.3. Calculation of the KL divergence in ensemble forecasting
The definition of the KL divergence in Eq. (2) aims to quantify the difference between two probability distributions, P and Q. To compute the KL divergence in ensemble forecasting, it is necessary to estimate the probability distribution of local reference (true) states (hereafter P) and the probability distribution of ensemble forecasts (hereafter Q). For a specific reference state xi, we first calculate the LAR of the state xi. Then, we can obtain the subset of all states on the attractor whose distance to the reference state is less than the LAR. Finally, the probability distribution P of local reference (true) states can be obtained based on the subset of the states on the local attractor.
When N random perturbations are added to or subtracted from the reference states, N different results of ensemble forecasts can be generated from the prediction model. Based on N ensemble forecasts, the probability distribution Q of ensemble forecasts can then be obtained. Once both P and Q are obtained, we can directly compute the KL divergence. As the reference state and ensemble forecasts change with the forecast time, the KL divergence will vary with the forecast time. By examining the evolution of the KL divergence with the forecast time, we can quantitatively estimate the predictability limit of ensemble forecasting.
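The text does not specify how P and Q are discretized, so the following is only a sketch of one common choice: both samples are binned on a shared set of equal-width bins for a single scalar variable, and a small pseudo-count keeps empty bins from producing an infinite divergence. The bin count, the pseudo-count, the restriction to one variable, and the function names are all assumptions introduced here. The synthetic Gaussian samples merely stand in for local reference states and ensemble members.

```python
import math
import random


def histogram_probs(samples, lo, hi, n_bins, pseudo=1e-6):
    """Bin samples on [lo, hi] and return smoothed bin probabilities."""
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for s in samples:
        k = min(max(int((s - lo) / width), 0), n_bins - 1)
        counts[k] += 1
    total = len(samples) + pseudo * n_bins
    return [(c + pseudo) / total for c in counts]


def kl_from_samples(reference, ensemble, n_bins=40):
    """Histogram estimate of D_KL(P||Q), with P from `reference`, Q from `ensemble`."""
    lo = min(min(reference), min(ensemble))
    hi = max(max(reference), max(ensemble))
    if hi == lo:  # degenerate case: all values identical
        hi = lo + 1.0
    p = histogram_probs(reference, lo, hi, n_bins)
    q = histogram_probs(ensemble, lo, hi, n_bins)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Synthetic stand-ins (not model output) for reference states and ensembles.
rng = random.Random(0)
reference = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ens_similar = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ens_shifted = [rng.gauss(3.0, 1.0) for _ in range(5000)]

kl_similar = kl_from_samples(reference, ens_similar)  # same underlying distribution
kl_shifted = kl_from_samples(reference, ens_shifted)  # displaced ensemble
```

Repeating such an estimate at each forecast time, with the reference sample taken from the local attractor of the current reference state, would give a KL divergence time series of the kind described above. The estimate is sensitive to the bin count and pseudo-count, so these should be checked rather than taken as given.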

2.4. Nonlinear local Lyapunov exponent method
The nonlinear local Lyapunov exponent (NLLE), which is a nonlinear extension of the existing linear finite-time or local Lyapunov exponents (Yoden and Nomura, 1993; Boffetta et al., 1998; Ziehmann et al., 2000), measures the mean growth rate of the initial errors of nonlinear dynamical systems without having to linearize the nonlinear equations of motion (Ding and Li, 2007; Ding et al., 2008a; Li and Ding, 2011). The NLLE and its derivative (i.e., the mean relative growth of the initial error) have been widely applied to quantitatively determine the limit of dynamic predictability of weather or climate variables (Ding et al., 2008b, 2010, 2011, 2015), exhibiting superior performance to the existing linear finite-time or local Lyapunov exponents. A brief description of the NLLE method is given in Appendix A.
Note that the NLLE method is defined based on nonlinear error dynamics, while the KL divergence is defined based on probability and information theory. Some differences exist between the two methods. For example, the NLLE method uses the root-mean-square error as the measure of error, and therefore depends on the dimension of variables. In contrast, the KL divergence uses the difference between two probability distributions as the measure of uncertainty, and therefore does not depend on the dimension of variables. This may be one advantage of the KL divergence relative to the NLLE method. Nevertheless, the predictability limit is an intrinsic property of a given dynamical system that does not depend on the specific method used to estimate it (Lorenz, 1969; Mu et al., 2017). Although the NLLE method determines the limit from the evolution of initial errors and the KL divergence determines it from the evolution of forecast probability distributions, the predictability limits of ensemble forecasting derived from the two approaches should be consistent (see Fig. 2). Therefore, we compare the predictability limits of ensemble forecasting derived from the KL divergence and NLLE. Their consistency would support the effectiveness of the KL divergence in measuring the predictability of ensemble forecasting.
Figure 2. Schematic illustration of the consistency of the predictability limit (denoted as Tp) derived from (a) error and (b) probability evolutions.



3. Results
Taking the three-variable Lorenz model as an example, we examine the evolution of the KL divergence with forecast time t for ensemble forecasting. Starting from a randomly chosen initial state x01 (-5.76, -0.29, 30.5) on the Lorenz attractor, we first integrate the Lorenz model to obtain the long-term model states as the reference states. The local attractor with respect to each reference state can be determined from the LAR, and then the probability distribution P of local reference (true) states can be obtained based on the subset of the states on the local attractor. Note that the local attractor and its probability distribution P depend on the reference states that vary with integration time. To obtain the probability distribution Q of ensemble forecasts, we superpose N=$10^5$ initial perturbations with the same amplitude, ε=$10^{-3}$, and random directions in phase space onto the initial state x01 to generate slightly different initial states. Then, ensemble predictions are made starting from these different initial states. For each forecast time, the probability distribution Q of ensemble forecasts can be obtained based on ensemble members. The KL divergence is calculated based on Eq. (2) for discrete probability distributions P and Q.
Figure 3a shows the variation in the KL divergence as a function of time t for the initial state x01 (-5.76, -0.29, 30.5). The KL divergence shows a nonuniform growth process with time. At time t=7, the KL divergence reaches a maximum value, implying that the probability distribution Q of ensemble forecasts deviates most from the probability distribution P of local reference (true) states. At this time, the forecast distribution yields unreliable probabilistic forecasts, and the ensemble prediction can be considered meaningless. If the time at which the KL divergence reaches its maximum value is specified as the local predictability limit, the predictability limit of ensemble forecasting starting from x01 with ε=$10^{-3}$ would be Tp≈7. For another initial state x02 (10.3, 0.92, 16.7), the KL divergence shows a similar zigzag growth process before it reaches the maximum value at around t=11 (Fig. 4a). According to the definition, the local predictability limit of ensemble forecasting starting from x02 with ε=$10^{-3}$ would be Tp≈11, greater than the predictability limit at x01.
Figure 3. For the initial state on the Lorenz attractor x01 (-5.76, -0.29, 30.5), we show (a) the KL divergence and (b) the mean error growth obtained using the NLLE method with ε=$10^{-3}$ as a function of time t. In (a), the time at which the KL divergence reaches its maximum value is indicated by the red dashed line. In (b), the average value of the nonlinear stochastic fluctuation states of the mean error is indicated by the black dashed line, and the time at which the error growth enters the nonlinear stochastic fluctuation states is indicated by the red dashed line.


Figure 4. As in Fig. 3 but for the other initial state on the Lorenz attractor, x02 (10.3, 0.92, 16.7).


To understand why the KL divergence varies with time, we examine the evolution of the probability distributions P (local reference states) and Q (ensemble forecasts) with time for x01 (Fig. 5). Both probability distributions P and Q change with time. At the beginning of ensemble forecasting, Q is concentrated in the center of P. Gradually, the range of Q becomes wider as ensemble perturbations tend to diverge over time. Correspondingly, Q begins to diverge from P and the KL divergence gradually becomes larger. When t=7, the difference between P and Q is significant. As a result, the KL divergence reaches its maximum value at that moment. Afterwards, Q gradually converges to the distribution of the entire attractor as ensemble perturbations expand to the entire attractor, and instead P falls within Q. Correspondingly, the KL divergence drops from the peak and then enters the nonlinear stochastic fluctuation phase. In addition, we note in Fig. 5 that ensemble forecasts do not follow a normal distribution for each forecast time, implying that the ensemble spread may not be appropriate to provide an accurate measure of the predictability of ensemble forecasting. In contrast, the application of the KL divergence in this paper excludes the influence of the type of probability distribution, and therefore ensures the accuracy of estimates of the predictability of ensemble forecasting.
Figure 5. Evolution of the probability distributions P (local reference states; black line; left axis) and Q (ensemble forecasts; red line; right axis) with time for x01 (-5.76, -0.29, 30.5).


We now examine the evolution of the local attractor (green points in Fig. 6) and ensemble forecast states (red points in Fig. 6) starting from x01 over the entire Lorenz attractor. At the beginning of the ensemble forecast, all ensemble forecast states fall within the local attractor. As the prediction time increases, ensemble forecast states begin to fall outside the local attractor, and gradually expand to the entire attractor. When t=7, almost all ensemble forecast states fall completely outside the local attractor, and the prediction subsequently becomes meaningless as the KL divergence reaches its maximum value (Li et al., 2018). Therefore, it is reasonable to use the maximum value of the KL divergence to measure the predictability of ensemble forecasting.
Figure 6. Evolution of the local attractor (green points) and ensemble forecast states (red points) starting from x01 (-5.76, -0.29, 30.5) over the entire Lorenz attractor (gray points).


The predictability of ensemble forecasting derived from the probability and error evolutions should be consistent. It is interesting to compare the local predictability limit obtained using the KL divergence and NLLE. Figure 3b shows the ensemble mean error growth over $10^5$ initial random perturbations obtained using the NLLE method for the initial state x01 and ε=$10^{-3}$. The mean error initially shows an oscillating growth, and finally stops increasing and enters the nonlinear stochastic oscillation regime with a constant average value. Once the error growth enters the nonlinear stochastic oscillation regime, almost all predictability is lost and the prediction becomes meaningless. Following the work of Ding et al. (2008b), we determine the local predictability limit as the time at which the mean error reaches the average value of the nonlinear stochastic fluctuation states. Then, we find that the predictability limit at x01 with ε=$10^{-3}$ calculated using the NLLE method is Tp≈7, which is consistent with the predictability limit derived from the KL divergence. Similarly, the predictability limit at x02 with ε=$10^{-3}$ calculated using the NLLE method is Tp≈11 (Fig. 4b), which is also consistent with the predictability limit derived from the KL divergence. The consistency across methods lends support to the effectiveness of the KL divergence in measuring the predictability of ensemble forecasting.
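The two definitions of the local predictability limit used in this comparison can be written as short functions acting on time series. This is a sketch under stated assumptions rather than the authors' procedure: the saturation level of the mean error is estimated here as the average over the final 20% of the series (the `tail_fraction` parameter is an assumption), and the two toy series are synthetic curves chosen only to exercise the functions, not results from the Lorenz model.

```python
import math


def limit_from_kl(times, kl_values):
    """Time at which the KL divergence attains its maximum value."""
    k = max(range(len(kl_values)), key=lambda i: kl_values[i])
    return times[k]


def limit_from_error(times, errors, tail_fraction=0.2):
    """First time the mean error reaches its saturation level.

    The saturation level is estimated as the mean of the last
    `tail_fraction` of the series (an assumed, adjustable choice).
    """
    n_tail = max(1, int(len(errors) * tail_fraction))
    saturation = sum(errors[-n_tail:]) / n_tail
    for t, e in zip(times, errors):
        if e >= saturation:
            return t
    return times[-1]


# Synthetic toy series (hypothetical shapes, not model output).
times = [0.01 * i for i in range(2001)]
# A curve that rises to a single peak and then decays.
kl_toy = [t * math.exp(-t / 7.0) for t in times]
# Exponential growth capped at a saturation level, with a small oscillation.
err_toy = [min(1e-3 * math.exp(0.9 * t), 10.0) * (1.0 + 0.05 * math.sin(5.0 * t))
           for t in times]

tp_kl = limit_from_kl(times, kl_toy)
tp_err = limit_from_error(times, err_toy)
```

For real output the series must extend well into the saturated regime, otherwise the tail average underestimates the saturation level and the error-based limit comes out too short.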
Figure 7 shows the variations in the local predictability limit of ensemble forecasting as a function of initial states xi with ε=$10^{-3}$ for a typical trajectory on the Lorenz attractor. The local predictability limit of ensemble forecasting varies widely with initial state on the Lorenz attractor. For 600 initial states, we find that the minimum value of the local predictability limit is ~3.6, while the maximum value is ~16. Local predictability limits obtained using the KL divergence and NLLE closely resemble each other, with a correlation coefficient of 0.92 (significant at the 99.9% confidence level). These results indicate that the predictability of ensemble forecasting depends on the initial states of the ensemble forecast.
Figure 7. Variations in the local predictability limit obtained using the KL divergence (red line) and NLLE (green line) as a function of initial states xi (i=1,2,…,600) with ε=$10^{-3}$ for a typical trajectory on the Lorenz attractor.


We now consider the structure of predictability in phase space by investigating the three-dimensional distribution of the local predictability limit derived from the KL divergence (Fig. 8). The local predictability limit has a distinct organization in phase space. On the whole, the inner and outer rims of each wing of the Lorenz attractor have a relatively high local predictability limit, while the regions between the inner and outer rims of each wing have a relatively low predictability limit, consistent with the distribution of the local predictability limit derived from the NLLE method (Huai et al., 2017). Huai et al. (2017) pointed out that this structure of the local predictability limit in phase space may be related to the local dynamics of the Lorenz attractor, which affect the length of time that each point remains on the current wing, and this period of time is important in determining the local predictability limit of each point. This underlying structure allows the identification of regions in phase space of high and low predictability and may be helpful in estimating the predictability for each point.
Figure 8. Three-dimensional distribution of the local predictability limit of 5000 states on the Lorenz attractor derived from the KL divergence.


The predictability of ensemble forecasting depends on the initial states as well as the magnitude of initial errors. For the analysis presented above, the magnitude ε of initial errors is fixed at $10^{-3}$. We next examine the dependence of the predictability of ensemble forecasting on the magnitude of initial errors. Figure 9 shows the local predictability limits derived from the KL divergence as a function of the magnitude of initial errors for the initial state x03 (6.03, 9.71, 16.5). As a comparison, local predictability limits derived from the NLLE method as a function of the magnitude of initial errors are also shown in Fig. 9. Local predictability limits derived from the KL divergence and NLLE decrease approximately linearly as the logarithm of the magnitude of initial errors is increased. For a specific initial error, the local predictability limit derived from the KL divergence is very close to the limit derived from the NLLE method. Similar results are obtained for initial states x01 and x02 (not shown), indicating that the predictability of ensemble forecasting is sensitive to the magnitude of initial errors.
Figure 9. Local predictability limits derived from the KL divergence (solid line with dots) and NLLE (dashed line with dots) as a function of the magnitude of initial error ε for the initial state x03 (6.03, 9.71, 16.5).
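A simple heuristic, which is not taken from the paper, may help explain the approximately linear dependence on the logarithm of ε. Assume, as an idealization, that the mean error grows roughly exponentially at an average rate $\bar{\lambda}$ until it saturates at a level $E^{*}$; both symbols are introduced here for illustration only. Then \begin{equation} \varepsilon\,e^{\bar{\lambda}T_p}\approx E^{*} \quad\Longrightarrow\quad T_p\approx\frac{1}{\bar{\lambda}}\left(\ln E^{*}-\ln\varepsilon\right) , \end{equation} so that $T_p$ decreases linearly in $\ln\varepsilon$ with slope $-1/\bar{\lambda}$. Actual error growth in the Lorenz model is state dependent and slows before saturation, so this expression is only a rough guide to the trend and not a substitute for the computed limits.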


Let us now consider an important question concerning the influence of the number of ensemble members on the predictability estimation of ensemble forecasting. Given that the KL divergence is obtained by computing the difference between the probability distributions of local reference states and ensemble forecasts, an accurate estimate of the KL divergence depends on an accurate estimate of the probability distributions of local reference states and ensemble forecasts. However, a sufficiently large number of ensemble members is required to accurately estimate the probability distribution of ensemble forecasts. It is likely that the method using the KL divergence to estimate the predictability of ensemble forecasting would give worse results for ensemble predictions using operational weather forecasting models, in which the number of ensemble members is usually restricted due to limitations in computing resources.
The number of ensemble members used in the present study is N=$10^5$. We examine the dependence of the estimated predictability limit of ensemble forecasting on the number of ensemble members. Figure 10 shows the estimated local predictability limit of ensemble forecasting starting from x01 as a function of the number of ensemble members. The number of ensemble members decreases from $10^5$ to 200; the latter is close to the number of ensemble members used in current operational weather forecasting. The local predictability limit is initially almost constant, followed by a gradual decrease with decreasing number of ensemble members. This result might be expected because the estimation of the probability distributions of local reference states and ensemble forecasts would have larger uncertainties for a smaller number of ensemble members. The estimated predictability limit obtained using 200 ensemble members is Tp≈6.5, which is slightly lower than the limit obtained using $10^5$ ensemble members. Similar results were obtained for other initial states (not shown). These results suggest that, although a smaller number of ensemble members tends to underestimate the predictability limit to some extent, such an underestimate is relatively slight. Consequently, it may be feasible to use a relatively small number of ensemble members to estimate the predictability of ensemble forecasting in the Lorenz model.
Figure 10. Estimated local predictability limit of ensemble forecasting starting from x01 (-5.76, -0.29, 30.5) over the Lorenz attractor as a function of the number of ensemble members.


Note that the above analyses are based solely on a simple toy model: the three-variable Lorenz model. For complex weather forecasting models, the situation may be different when we try to estimate the probability distribution in a higher-dimensional space using a small number of ensemble forecasts. In this case, the probability distribution of ensemble forecasts may be poorly estimated, possibly leading to a large error in the estimation of predictability. This may be a limitation of the KL divergence. Hopefully, with increased computing resources available, the number of ensemble members can be further increased in real-world numerical weather models. Further research is required to examine the application of the KL divergence in real-world ensemble weather forecasts and to assess the influence of ensemble size on estimates of predictability.
We then consider another important question regarding the influence of model errors on the accurate estimation of the KL divergence. This study simply uses the Lorenz model without model error. Given the existence of model error, the probability distribution of true states P is generally unknown. If the forecast states are used instead of true states, the KL divergence and hence the estimated predictability limit would possibly include an error. For the Lorenz attractor, the local attractor with respect to a given state is not sensitive to the state itself (see Fig. 1), and the local attractor and its probability distribution of nearby states are similar. Consequently, a small error in the Lorenz model would produce a relatively small initial error in the KL divergence. In real-world ensemble weather forecasts, although models are imperfect, a large amount of observed atmospheric data is available. We can use observations to estimate the probability distribution of true states; this remains a topic for future research.

4. Conclusions
We have presented a new method using the KL divergence to measure the predictability of ensemble forecasting. The KL divergence allows us to estimate the difference between the probability distributions of ensemble forecasts and local reference (true) states. By investigating the evolution of the KL divergence with time, the local predictability limit of ensemble forecasting may be quantitatively determined. The KL divergence is applicable to a non-normal distribution of ensemble forecasts. This represents an improvement over the ensemble spread, which is only applicable under the assumption that the ensemble members follow a normal distribution. Using the KL divergence, we have performed a quantitative analysis of the predictability of ensemble forecasting in the Lorenz model. The local predictability limit derived from the KL divergence is clearly consistent with that derived from error evolution, lending support to the effectiveness of the KL divergence in measuring the predictability of ensemble forecasting.
In addition, we have investigated the sensitivity of the predictability of ensemble forecasting to the initial states and the magnitude of initial errors. We found that the predictability of ensemble forecasting depends on the initial states as well as on the magnitude of initial errors. The local predictability limit of ensemble forecasting varies considerably with time, but the predictability variability shows organization in phase space. The predictability of ensemble forecasting is also sensitive to the magnitude of initial errors. The local predictability limit decreases approximately linearly as the logarithm of the magnitude of initial errors is increased.
Our study presents a preliminary application of the KL divergence in measuring the predictability of ensemble forecasting in a relatively simple system. More complex ensemble weather or climate forecasts involve higher dimensionality and more complicated models, which introduces uncertainties and poses a challenge to the accurate estimation of the KL divergence for operational forecasts. It would be interesting to extend the current investigation to more realistic ensemble weather forecasts, which we intend to examine in future research. In addition, this study simply used random perturbations as ensemble perturbations. Various schemes have been developed to generate the initial perturbations in ensemble forecasts, such as the bred vector method (Toth and Kalnay, 1993, 1997), the singular vector method (Molteni et al., 1996; Buizza, 1997), and the ensemble transform Kalman filter (Bishop et al., 2001; Wang and Bishop, 2003). These schemes have been shown to improve operational forecasts compared with random perturbations. It would be worthwhile to use the KL divergence to examine the predictability of ensemble forecasts whose perturbations are generated by such schemes.

APPENDIX A
Introduction to the NLLE method
Consider a general n-dimensional nonlinear dynamical system whose evolution is governed by \begin{equation} \frac{{d}{x}}{dt}={F}({x}) , \ \ (A1)\end{equation}
where x=[x1(t),x2(t),…,xn(t)] T is the state vector at time t, the superscript T denotes the transpose, and F represents the dynamics. The evolution of a small error δ=[δ1(t),δ2(t),…,δn(t)] T, superimposed on a state x, is governed by the following nonlinear equation: \begin{equation} \frac{d}{dt}{\delta}={J}({x}){\delta}+{G}({x},{\delta}) , \ \ (A2)\end{equation} where J(x)δ are the tangent linear terms and G(x,δ) are the high-order nonlinear terms of the error δ. Without a linear approximation, the solutions of Eq. (A2) can be obtained by numerical integration along the reference solution x from t=t0 to t0+τ: \begin{equation} {\delta}_1={\eta}({x}_0,{\delta}_0,\tau){\delta}_0 , \ \ (A3) \end{equation} where δ1=δ(t0+τ), x0=x(t0), δ0=δ(t0), and η(x0,δ0,τ) is the nonlinear propagator. The NLLE is then defined as \begin{equation} \lambda({x}_0,{\delta}_0,\tau)=\frac{1}{\tau}\ln\frac{\|{\delta}_1\|}{\|{\delta}_0\|} , \ \ (A4)\end{equation} where λ(x0,δ0,τ) depends in general on the initial state x0 in phase space, the initial error δ0, and time τ. The NLLE differs from existing local or finite-time Lyapunov exponents defined from linear error dynamics, which depend solely on the initial state x0 and time τ, and not on the initial error δ0. Assuming that all initial perturbations with amplitude ε and random directions are on an n-dimensional spherical surface centered at an initial point x0, then we have \begin{equation} {\delta}_0^{\rm T}{\delta}_0=\varepsilon^2 . \ \ (A5) \end{equation}
The local ensemble mean of the NLLE over a large number of random initial perturbations is given by \begin{equation} \bar{\lambda}({x}_0,\tau)=\langle{\lambda({x}_0,{\delta}_0,\tau)}\rangle_N , \ \ (A6)\end{equation} where $\langle\ \rangle_N$ denotes the local ensemble average of samples of large enough size N ($N\to\infty$). Here, $\bar\lambda(x_0,\tau)$ characterizes the average growth rate of random perturbations superimposed on x0 within a finite time τ. For a fixed time τ, $\bar\lambda(x_0,\tau)$ depends on x0 and reflects the local error growth dynamics of the attractor. The mean local relative growth of the initial error can be obtained by \begin{equation} \bar{E}({x}_0,\tau)=e^{[\bar{\lambda}({x}_0,\tau)\tau]} . \ \ (A7) \end{equation}
For a given initial state x0, $\bar{E}(x_0,\tau)$ initially increases with time τ and finally reaches a state of nonlinear stochastic fluctuation, which means that error growth reaches saturation with a constant average value. At that moment, almost all information on the initial state is lost and the prediction becomes meaningless. If the local predictability limit is defined as the time at which the error reaches the average value of the nonlinear stochastic fluctuation states, the predictability limit of the system at x0 can be quantitatively determined.
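The procedure in Eqs. (A4) to (A7) can be sketched numerically for the three-variable Lorenz model. The example below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the classical parameter values (σ=10, r=28, b=8/3), a fourth-order Runge-Kutta scheme with a step of 0.01, and a saturation level estimated as the mean of the growth curve over the final quarter of the integration. All function names and the parameters `tail_fraction` and `threshold_fraction` are hypothetical.

```python
import numpy as np

def lorenz(x, sigma=10.0, r=28.0, b=8.0 / 3.0):
    """Lorenz (1963) tendencies; x has shape (3,) or (N, 3)."""
    X, Y, Z = x[..., 0], x[..., 1], x[..., 2]
    return np.stack([sigma * (Y - X), r * X - Y - X * Z, X * Y - b * Z], axis=-1)

def rk4_step(x, dt):
    """One fourth-order Runge-Kutta step of the full nonlinear model."""
    k1 = lorenz(x)
    k2 = lorenz(x + 0.5 * dt * k1)
    k3 = lorenz(x + 0.5 * dt * k2)
    k4 = lorenz(x + dt * k3)
    return x + dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0

def mean_relative_error_growth(x0, eps=1e-5, n_members=100, n_steps=3000,
                               dt=0.01, seed=0):
    """Return (tau, E_bar) following Eqs. (A4)-(A7) for initial state x0."""
    rng = np.random.default_rng(seed)
    # Random directions rescaled to norm eps, as in Eq. (A5).
    d0 = rng.standard_normal((n_members, 3))
    d0 *= eps / np.linalg.norm(d0, axis=1, keepdims=True)

    ref = np.asarray(x0, dtype=float).copy()
    pert = ref + d0
    tau = dt * np.arange(1, n_steps + 1)
    e_bar = np.empty(n_steps)
    for k in range(n_steps):
        # Errors evolve nonlinearly: both states use the full model.
        ref = rk4_step(ref, dt)
        pert = rk4_step(pert, dt)
        growth = np.linalg.norm(pert - ref, axis=1) / eps
        lam = np.log(growth) / tau[k]          # Eq. (A4), one value per member
        lam_bar = lam.mean()                   # Eq. (A6), finite-N ensemble mean
        e_bar[k] = np.exp(lam_bar * tau[k])    # Eq. (A7)
    return tau, e_bar

def local_predictability_limit(tau, e_bar, tail_fraction=0.25,
                               threshold_fraction=1.0):
    """First time E_bar reaches the (assumed) saturation level."""
    n_tail = max(1, int(len(e_bar) * tail_fraction))
    saturation = e_bar[-n_tail:].mean()
    idx = int(np.argmax(e_bar >= threshold_fraction * saturation))
    return tau[idx]
```

A typical use would first integrate an arbitrary state for a spin-up period so that x0 lies on the attractor, then call `mean_relative_error_growth(x0)` and pass the result to `local_predictability_limit`. The limit obtained this way depends on the assumed saturation estimate and on the integration length, which must be long enough for the error to saturate.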
