2.1. FY-4A GIIRS data
GIIRS is one of the main instruments onboard China’s second-generation geostationary meteorological satellite, FY-4A, which was launched on 11 December 2016. It is the first precise remote sensing instrument in geostationary orbit, detecting the vertical structure of the three-dimensional atmosphere via infrared hyperspectral interference spectroscopy (Menzel et al., 2018). The primary task of GIIRS is to measure the distribution of temperature and humidity in the atmosphere. As the first hyperspectral sounder mounted on a geostationary satellite, GIIRS complements the observations of sounders on polar-orbiting satellites, providing almost continuous temporal, horizontal and vertical observations, which are especially useful for regional nowcasting and tracking extreme weather events.
GIIRS uses the Fourier transform to recover the atmospheric absorption spectrum with high spectral resolution. It covers the long-wavelength band (700–1130 cm⁻¹), the mid-wavelength band (1650–2250 cm⁻¹), and the visible light band (0.55–0.75 μm). The detectors for the two infrared bands each have 32 × 4 sensor elements. There are a total of 1650 infrared channels, of which 689 are in the long-wavelength band and 961 in the mid-wavelength band. The spectral resolution of both the long- and mid-wavelength bands is 0.625 cm⁻¹. The spatial resolution of the infrared and visible bands is 16 km and 2 km, respectively, and the temporal resolution is 67 min (for the China area). Table 1 lists the main performance characteristics of GIIRS.
Parameter | Performance |
Spectral bandwidth | Long wave: 700–1130 cm⁻¹ |
 | Mid wave: 1650–2250 cm⁻¹ |
 | Visible: 0.55–0.75 μm |
Spectral channels | Long wave: 689 |
 | Mid wave: 961 |
Spectral resolution | Long wave: 0.625 cm⁻¹ |
 | Mid wave: 0.625 cm⁻¹ |
Spatial resolution | Long/Mid wave: 16 km SSP* |
 | Visible: 2 km SSP |
Temporal resolution | China area: 67 min |
 | Mesoscale area: 35 min |
Sensitivity | Long wave: 0.5–1.1 |
 | Mid wave: 0.1–0.14 |
 | Visible: S/N > 200 (p = 100%) |
Calibration accuracy | Radiation: 1.5 K (3σ) |
 | Spectrum: 10 ppm (3σ) |
*SSP: the sub-satellite point.
Table 1. Key GIIRS parameters.
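The channel counts quoted above follow directly from the band limits and the 0.625 cm⁻¹ spectral resolution. The short check below is only an illustrative sketch; the helper name `channel_count` is hypothetical, and it assumes that both band edges lie on the sampling grid.

```python
def channel_count(start_cm1: float, stop_cm1: float, step_cm1: float) -> int:
    """Number of grid points from start to stop (both inclusive) at a fixed step.

    Assumes both band edges fall exactly on the sampling grid.
    """
    return round((stop_cm1 - start_cm1) / step_cm1) + 1


long_wave = channel_count(700.0, 1130.0, 0.625)
mid_wave = channel_count(1650.0, 2250.0, 0.625)
print(long_wave, mid_wave, long_wave + mid_wave)
```

Under this assumption the three printed numbers reproduce the 689, 961 and 1650 channels listed in the text.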
The observational area of GIIRS in China is not fixed, but rather is based on the requirements of the Numerical Weather Prediction Center of the China Meteorological Administration (CMA). Unlike IASI, GIIRS does not observe a single continuous spectral range, but rather splits its coverage into two separate infrared bands. Therefore, following the retrieval method of the GIIRS Level-2 (L2) temperature profile products, the channel selection method proposed in this paper focuses only on the long-wavelength band, using clear-sky GIIRS Level-1 data from August 2018 to November 2018.
2.2. China radiosonde data
The high-resolution radiosonde data used in this study are from the China Integrated Meteorological Information Service System of the CMA. The L-band (1675 MHz) sounding radar is a new generation of secondary wind-measuring radar, which is fully automated in China and synchronized with GTS1 digital electronic radiosondes (Zeng et al., 2019). It is widely used at operational radiosonde stations in China to measure air temperature, pressure, relative humidity, and wind from the ground to about 30 km. During the balloon ascent, the sounding data were collected approximately every 1.2 s, and the accuracy of the measured temperature, pressure, and humidity is 0.2°C–0.3°C, 1–2 hPa, and 4%–5%, respectively (CMA, 2010). Compared with the previous radiosonde, Model 49, and the second-generation radiosonde, Model 59, the data acquisition rate, accuracy, and reliability of the L-band radiosonde are significantly better, and the temperature accuracy of the GTS1 radiosonde is comparable to that of the Vaisala RS80 and RS92 radiosondes. There are 120 L-band upper-air sounding systems in China, all of which provide Level-2 sounding data at 0000 UTC and 1200 UTC daily, according to the demand for high-altitude observations, in addition to intensive (additional) observations at 0600 UTC and/or 1800 UTC depending on weather conditions or detection experiments.
Radiosonde data used in this paper have undergone strict quality-control procedures by the National Meteorological Information Center of the CMA, including missing-data inspection, station climatological limit value inspection, vertical consistency checks, duplicate value checks, and internal consistency checks (Yuan et al., 2016). Corresponding to the Level-1 radiance data of GIIRS, radiosonde data from August 2018 to November 2018 were selected, including both conventional and intensive observations. The radiosonde data were used as true values to test the performance of the temperature profile retrieval model.
3.1. Data matching
Before implementing data matching, each radiance value detected by GIIRS should be converted to a brightness temperature value according to the inverse Planck function:
$$ t = \frac{c_2 v}{\ln\left(1 + \dfrac{c_1 v^3}{L_{\rm br}}\right)} $$
where t is the blackbody temperature (K), Lbr is the blackbody radiance (mW m⁻² sr⁻¹ cm), c1 = 119104.2 (mW m⁻² sr⁻¹ cm⁴), c2 = 1.4387752 (K cm) and v is the wavenumber (cm⁻¹).
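The conversion from radiance to brightness temperature can be sketched as follows. This is a minimal illustration, not the authors' processing code: the constants are taken as quoted in the text, the function names are hypothetical, and the radiance must be expressed in units consistent with whichever value of c1 is used. The forward Planck function is included only so that the inversion can be checked by a round trip.

```python
import math

# Radiation constants as quoted in the text (units must match the radiance units).
C1 = 119104.2
C2 = 1.4387752


def planck_radiance(wavenumber: float, temperature: float,
                    c1: float = C1, c2: float = C2) -> float:
    """Forward Planck function: blackbody radiance at a wavenumber (cm^-1)."""
    return c1 * wavenumber ** 3 / (math.exp(c2 * wavenumber / temperature) - 1.0)


def brightness_temperature(wavenumber: float, radiance: float,
                           c1: float = C1, c2: float = C2) -> float:
    """Inverse Planck function: temperature (K) of a blackbody emitting `radiance`."""
    return c2 * wavenumber / math.log(1.0 + c1 * wavenumber ** 3 / radiance)


# Round trip: a 280 K blackbody at 900 cm^-1 should be recovered.
radiance = planck_radiance(900.0, 280.0)
print(brightness_temperature(900.0, radiance))
```

Because the same constants appear in both directions, the round trip is insensitive to the unit convention chosen for c1; real radiances require the convention to match the data product.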
Then, the coordinates of each detector unit (latu, lonu) are matched with those of each L-band sounding station (lats, lons). By setting a threshold D on the great-circle distance on the surface of a sphere, sample pairs [(latu, lonu), (lats, lons)] that meet the following distance threshold criterion can be obtained:
$$ R \arccos\left[ \sin({\rm lat}_u)\sin({\rm lat}_s) + \cos({\rm lat}_u)\cos({\rm lat}_s)\cos({\rm lon}_u - {\rm lon}_s) \right] \le D $$
where R represents the radius of the Earth (6371 km). The distance threshold D is set according to the spatial resolution of GIIRS.
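The spatial matching step can be sketched as below. This is an assumption-laden illustration: the haversine form of the great-circle distance is used for numerical stability at short ranges, the 16 km default threshold is merely suggested by the infrared spatial resolution quoted earlier, and the function names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # radius quoted in the text


def great_circle_km(lat1: float, lon1: float, lat2: float, lon2: float,
                    radius_km: float = EARTH_RADIUS_KM) -> float:
    """Great-circle distance between two points given in degrees (haversine form)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2.0 * radius_km * math.asin(math.sqrt(a))


def match_pairs(units, stations, threshold_km: float = 16.0):
    """Return every (unit, station) pair whose separation is within the threshold.

    The 16 km default is an assumption, not a value stated by the authors.
    """
    return [(u, s) for u in units for s in stations
            if great_circle_km(u[0], u[1], s[0], s[1]) <= threshold_km]


# Hypothetical coordinates (lat, lon) in degrees.
units = [(39.90, 116.40), (31.20, 121.50)]
stations = [(39.95, 116.45)]
print(match_pairs(units, stations))
```

The exhaustive pairwise loop is adequate for 120 stations; a spatial index would only matter for much larger station sets.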
In addition to spatial matching, the observation time is also considered. Starting from the ground, it takes about 75 min for the sounding balloon (flight time starts at 2315 UTC, 1115 UTC, 0515 UTC or 1715 UTC) to reach a height of about 30 km (CMA, 2010). A single temperature profile may therefore match multiple brightness temperature spectra observed during the balloon’s flight. To improve the robustness of the neural network, these samples are retained for training. After spatial and temporal matching, a total of 9734 sample pairs were selected, each consisting of a brightness temperature spectrum detected by GIIRS and the corresponding temperature profile obtained by a radiosonde.
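One plausible way to express the temporal criterion is to accept a GIIRS observation when it falls inside the balloon's flight window. The sketch below assumes a fixed 75 min window starting at the launch time; the function name and timestamps are hypothetical, and the authors may have applied a different rule.

```python
from datetime import datetime, timedelta

ASCENT = timedelta(minutes=75)  # approximate ascent time quoted in the text


def in_flight_window(scan_time: datetime, launch_time: datetime,
                     ascent: timedelta = ASCENT) -> bool:
    """True if the satellite scan time lies within the balloon's flight window."""
    return launch_time <= scan_time <= launch_time + ascent


# A 2315 UTC launch spans midnight, so full datetimes are compared, not clock times.
launch = datetime(2018, 8, 1, 23, 15)
print(in_flight_window(datetime(2018, 8, 2, 0, 10), launch))
print(in_flight_window(datetime(2018, 8, 2, 0, 31), launch))
```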
For each brightness temperature spectrum, we defined a range of 150–350 K to ensure its validity. Thus, any value outside this range was excluded. For sounding temperature profiles, because the vertical resolution varies from station to station, and from sounding to sounding at the same radiosonde station, we required a more demanding selection. According to statistical analysis of the profiles, we further selected the effective data for pressure levels in the range 100–900 hPa, which cover most areas in China except the Qinghai–Tibet Plateau and ensure the consistency of output layers in retrieval validation models. We then performed linear interpolation based on the detection accuracy (0.1 hPa) of the pressure level to obtain the corresponding temperature value at each level. Finally, some pressure levels [900 hPa, 875 hPa, 850 hPa, 825 hPa, 800 hPa, 775 hPa, 750 hPa, 700 hPa, 650 hPa, 600 hPa, 550 hPa, 500 hPa, 450 hPa, 400 hPa, 350 hPa, 300 hPa, 250 hPa, 225 hPa, 200 hPa, 175 hPa, 150 hPa, 125 hPa, 100 hPa] in this range were selected based on the pressure levels in the ERA5 reanalysis data.
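The validity check and the interpolation to fixed pressure levels can be sketched as follows. This is a simplified illustration: it interpolates linearly in pressure directly onto the 23 target levels rather than first building a 0.1 hPa grid, and the function names and the sample profile are hypothetical.

```python
# The 23 output levels listed in the text (hPa).
LEVELS_HPA = [900, 875, 850, 825, 800, 775, 750, 700, 650, 600, 550, 500,
              450, 400, 350, 300, 250, 225, 200, 175, 150, 125, 100]


def spectrum_is_valid(bt_spectrum, lo: float = 150.0, hi: float = 350.0) -> bool:
    """True if every brightness temperature lies inside the accepted range (K)."""
    return all(lo <= bt <= hi for bt in bt_spectrum)


def interp_temperature(pressure, temperature, target: float) -> float:
    """Linear interpolation of temperature at `target` pressure.

    `pressure` must be strictly decreasing (surface to top of sounding).
    """
    for i in range(len(pressure) - 1):
        p0, p1 = pressure[i], pressure[i + 1]
        if p1 <= target <= p0:
            t0, t1 = temperature[i], temperature[i + 1]
            return t0 + (t1 - t0) * (p0 - target) / (p0 - p1)
    raise ValueError("target pressure lies outside the sounding")


# Hypothetical coarse sounding: pressure (hPa) and temperature (K).
p = [1000.0, 800.0, 600.0]
t = [290.0, 280.0, 260.0]
print(interp_temperature(p, t, 900.0), interp_temperature(p, t, 700.0))
```

Raising an error for out-of-range targets, instead of extrapolating, mirrors the idea that a profile which does not span 100–900 hPa should be rejected.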
A total of 5021 sample pairs met the requirements of the data matching and preprocessing stages. The results of the above process are shown in Fig. 1. The GIIRS infrared brightness temperature spectra in this dataset were used for channel selection and in the neural network models with the corresponding sounding temperature profiles. The specific data usage will be explained in the respective sections.
Figure 1. Geographic distribution of radiosonde stations (120) of the CMA and the results of data matching. In total, 79 radiosonde stations (red dots) met the data matching requirements. The scatter size (red dots) corresponds to the number of matched sample pairs.
3.2. Temperature channel selection
3.2.1. Brightness temperature spectrum clustering
We consider the brightness temperature spectrum as a time-like sequence because it reflects continuous spectral information. We first clustered the brightness temperature spectrum samples using the k-means++ clustering method (Arthur and Vassilvitskii, 2007), extracting the central sequence of each cluster to represent the overall characteristics of that cluster for the SSA. The k-means++ clustering method is a variant of the k-means clustering method that effectively avoids the poor clustering performance and slow convergence caused by random selection of the initial center points (Arthur and Vassilvitskii, 2007). The preset cluster number k has the range [0, 10], and the sum of the squared error (SSE) under different k values was calculated according to the following formula to obtain the optimal number of clusters:
$$ {\rm SSE} = \sum_{i=1}^{k} \sum_{p \in C_i} \left| p - m_i \right|^2 $$
where k is the number of clusters, Ci is the ith cluster, p is a point of Ci, and mi is the mean of all points in Ci.
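The SSE used to compare different values of k can be computed as in the sketch below. It assumes the clustering itself has already been done (for example by any k-means++ implementation) and only evaluates the criterion; the function name and the toy points are hypothetical.

```python
def sse(clusters) -> float:
    """Sum of squared distances of every point to the mean of its own cluster.

    `clusters` is a list of clusters; each cluster is a non-empty list of
    equal-length points (sequences of floats).
    """
    total = 0.0
    for points in clusters:
        dim = len(points[0])
        centre = [sum(p[d] for p in points) / len(points) for d in range(dim)]
        total += sum((p[d] - centre[d]) ** 2 for p in points for d in range(dim))
    return total


# Toy example: the same three points grouped into two clusters or one.
two_clusters = [[[0.0, 0.0], [2.0, 0.0]], [[5.0, 5.0]]]
one_cluster = [[[0.0, 0.0], [2.0, 0.0], [5.0, 5.0]]]
print(sse(two_clusters), sse(one_cluster))
```

Plotting the SSE against k and looking for the point where the decrease levels off (the "elbow") is a common way to pick the number of clusters.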
3.2.2. SSA
The SSA method can decompose a sequence into multiple components reflecting different implicit information, such as local energy variation and periodic variation in the original sequence (Hassani, 2007; Golyandina and Zhigljavsky, 2013). Suppose we have a central sequence S = (s1, s2, …, sN) with length N. S is first converted to a trajectory matrix X consisting of lagged vectors with window length L. Then, the SVD of the trajectory matrix X can be written as
$$ X = \sum_{i=1}^{d} \sqrt{\lambda_i}\, U_i V_i^{\rm T} $$
where d is the rank of X, λ1 ≥ … ≥ λd > 0 are the nonzero eigenvalues of XX^T, and Ui and Vi are the corresponding left and right singular vectors.
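Next, we use the w-correlation to measure the correlation between the components and place highly correlated components in the same group. For two components of length N, Yi and Yj, and window length L, we define the weighted inner product as
$$ (Y_i, Y_j)_w = \sum_{k=1}^{N} w_k\, y_{i,k}\, y_{j,k} $$
where yi,k and yj,k are the kth values of Yi and Yj, respectively, and wk is given by
$$ w_k = \min(k,\ L,\ N-L+1,\ N-k+1) $$
The weight wk reflects the number of times yi,k and yj,k appear in the trajectory matrix X of the sequence S. If $(Y_i, Y_j)_w = 0$, the two components are w-orthogonal. In the standard SSA formulation (Golyandina and Zhigljavsky, 2013), the w-correlation is
$$ \rho^{(w)}_{ij} = \frac{(Y_i, Y_j)_w}{\|Y_i\|_w\, \|Y_j\|_w} $$
where $\|Y\|_w = \sqrt{(Y, Y)_w}$. Components with a w-correlation close to zero are well separated, whereas components with a high w-correlation are placed in the same group.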
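The weights and the w-correlation can be sketched as follows, using the standard SSA definitions (Golyandina and Zhigljavsky, 2013) rather than anything specific to this paper. The function names are hypothetical; the weight of position k is taken to be the number of entries of the L × (N − L + 1) trajectory matrix that hold the kth element of the sequence.

```python
import math


def w_weights(n: int, window: int):
    """w_k for k = 1..n: how often element k occurs in the trajectory matrix."""
    k_cols = n - window + 1
    return [min(k, window, k_cols, n - k + 1) for k in range(1, n + 1)]


def w_inner(a, b, window: int) -> float:
    """Weighted inner product of two equal-length components."""
    return sum(wk * x * y for wk, x, y in zip(w_weights(len(a), window), a, b))


def w_correlation(a, b, window: int) -> float:
    """Weighted correlation; values near 0 indicate well-separated components."""
    return w_inner(a, b, window) / math.sqrt(w_inner(a, a, window) * w_inner(b, b, window))


print(w_weights(6, 3))
trend = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
wiggle = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
print(w_correlation(trend, wiggle, 3))
```

In practice the matrix of pairwise w-correlations is inspected, and components whose mutual values are large are merged into one grouped component.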
3.2.3. Channel selection
Channel selection for hyperspectral data can both improve the predictive ability of the model and make it considerably easier to interpret. However, the main difficulties are that consecutive variables in the spectrum are highly correlated by nature and that real applications usually involve databases with few known spectra but many spectral variables. The purpose of performing SSA on a brightness temperature spectrum is to find temperature information that is not easily seen in the spectrum itself, but is revealed in its grouped components. The grouped components can be interpreted as filtered and amplified versions of the brightness temperature spectrum that respond to specific requirements, such as temperature-sensitive bands. The difference between the grouped components is maximized, ensuring that the channels selected from each grouped component are representative of that component. Because each grouped component carries different information, we adopt a method based on each grouped component’s SD to select channels adaptively. The SD measures how far the grouped component fluctuates from its mean, and the variance (SD²) represents the power of this fluctuation. We then combine the selected channels as the final channel subset of this sequence. For a grouped component Y = (y1, y2, …, yN):
(1) Calculate the SD and use it as a threshold by
$$ {\rm SD} = \sqrt{\frac{1}{N}\sum_{k=1}^{N} \left( y_k - \bar{y} \right)^2} $$
where $\bar{y}$ is the mean of Y.
(2) Find the extrema, including the maxima and minima. A point ym of Y is an extremum if there are adjacent indices i and j, where i < m < j, such that ym (maximum) strictly satisfies yi < ym and ym > yj, or ym (minimum) strictly satisfies yi > ym and ym < yj. All extrema of Y, called feature points, are then screened by the SD threshold and retained only if they satisfy ym > SD or ym < −SD.
By using the SD as a threshold for screening, we obtain the indices of the feature points of this grouped component. Each index corresponds to a wavenumber and is treated as a channel used to retrieve temperature profiles. The channels of the other grouped components are selected following the same procedure.
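The two-step screening of a grouped component can be sketched as below. This is an illustrative reading of the procedure, not the authors' code: the population form of the SD (division by N) is an assumption, the function name is hypothetical, and the toy sequence is invented.

```python
import math


def select_channels(component):
    """Indices of strict local extrema whose value lies beyond +/- one SD.

    Assumes the population standard deviation (division by N).
    """
    n = len(component)
    mean = sum(component) / n
    sd = math.sqrt(sum((y - mean) ** 2 for y in component) / n)
    selected = []
    for m in range(1, n - 1):
        left, mid, right = component[m - 1], component[m], component[m + 1]
        is_max = left < mid and mid > right
        is_min = left > mid and mid < right
        # Keep only extrema that exceed the SD threshold in either direction.
        if (is_max or is_min) and (mid > sd or mid < -sd):
            selected.append(m)
    return selected


# Toy grouped component: two strong extrema and several weak ones.
component = [0.0, 3.0, 0.0, -0.5, 0.0, -3.0, 0.0, 0.5, 0.0]
print(select_channels(component))
```

Running the same function on every grouped component and taking the union of the returned indices gives the channel subset for one central sequence.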
3.3. ANN
ANNs are widely used in nonlinear regression (Feng et al., 2017; van Gerven and Bohte, 2017) and pattern recognition models (Iglovikov et al., 2017). Many researchers are also using ANNs and their variants to retrieve atmospheric parameters (Blackwell, 2005; Chakraborty and Maitra, 2016; Whitburn et al., 2016) or surface parameters (van Damme et al., 2017; Ge et al., 2018). The ANN retrieval method does not aim to explicitly formulate the physical processes linking the satellite observations to the atmospheric state, but instead creates a model of the nonlinear statistical relationship between them. In contrast to a radiative transfer model, an ANN does not rely on knowledge of the physical processes and requires less a priori knowledge of the atmospheric characteristics and radiative transfer parameters. Instead, it provides the best model to combine the atmospheric information provided by the satellite data inputs (Kolassa et al., 2017). In light of the above, a typical neural network with a single hidden layer is used to validate the performance of the temperature profile retrieval, with the final channel subset as its input. There are 23 fixed output nodes, each of which corresponds to the temperature value of a pressure level. The min-max scaling method was used to normalize inputs/outputs to fall in the range [−1, 1]. Nodes in the hidden and output layers apply the sigmoidal activation function and linear activation function, respectively. The ANN was trained using the Levenberg–Marquardt back-propagation algorithm (Moré, 1978). Figure 2 shows the model architecture.
Figure 2. The architecture of the ANN model. The brightness temperature spectrum sample has 689 channels in the long-wavelength band (700–1130 cm⁻¹) with a spectral resolution of 0.625 cm⁻¹. The sounding temperature profile sample has a total of 23 selected pressure levels between 100 and 900 hPa, and each pressure level corresponds to a measured temperature value.
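The scaling and the forward pass of such a single-hidden-layer network can be sketched as follows. This is a minimal illustration with assumptions stated plainly: the hyperbolic tangent stands in for the "sigmoidal" activation, training (Levenberg–Marquardt) is not shown, and all names and weights are hypothetical.

```python
import math


def minmax_scale(values, lo: float = -1.0, hi: float = 1.0):
    """Linearly map values so that the minimum becomes lo and the maximum hi."""
    vmin, vmax = min(values), max(values)
    return [lo + (hi - lo) * (v - vmin) / (vmax - vmin) for v in values]


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: tanh hidden layer followed by a linear output layer."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return [sum(w * h for w, h in zip(row, hidden)) + b
            for row, b in zip(w_out, b_out)]


# Toy network: 3 inputs, 2 hidden nodes, 2 outputs (the real model has 23 outputs).
x = minmax_scale([200.0, 250.0, 300.0])
w_hidden = [[0.1, 0.2, 0.3], [-0.3, 0.0, 0.3]]
b_hidden = [0.0, 0.0]
w_out = [[1.0, -1.0], [0.5, 0.5]]
b_out = [0.1, 0.2]
print(x, forward(x, w_hidden, b_hidden, w_out, b_out))
```

The network outputs are in the scaled range and must be mapped back with the inverse of the same min-max transform before they are compared with radiosonde temperatures.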
The neural network model of the temperature profile retrieval was developed based on a training set and validation set, while the performance of the neural network was evaluated using a testing set. The training, validation, and testing sets comprised 60%, 20%, and 20% of the sample pairs, respectively. The network training stops when the number of validation checks reaches 12, that is, when the validation performance has failed to improve for 12 successive iterations (Beale et al., 2018). All the nodes of the output layer are combined to produce a complete predicted temperature profile, which is compared with the target (the true temperature profile in the sample pairs of the training set). Each node error can be calculated by
$$ e_i = y_i - \hat{y}_i $$
where yi and $\hat{y}_i$ are the true and predicted temperature values at the ith output node, respectively.
The performance of the retrieval depends on the architecture of the ANN model—in particular, the number of nodes in the input layer and the hidden layer. The former is determined by different channel subsets, including the temperature channel subset of IASI, CrIS, and our channel subset selected based on the SSA method. The candidate number of hidden layer neurons (Table 2) is based on results from previous studies (Wanas et al., 1998; Shibata and Ikeda, 2009; Beale et al., 2018).
Approach | Criterion | Hidden neuron number |
Matlab nntool default setting approach | – | 10 |
Nayer Wanas & Gasser Auda approach | ${\log _2}T$ | 12 |
Katsunari Shibata & Yusuke Ikeda approach | $\sqrt {{{\left({{N_c}} \right)}_{\rm{i}}}{N_{\rm{o}}}} $ | 23 |
Proposed approach | $\sqrt T $ | 55 |
Table 2. Hidden layer neuron settings of the neural network. T is the number of samples used for training, which in this case is 3012.
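The two criteria in Table 2 that depend only on T can be checked with a few lines of arithmetic. The sketch below assumes that the training set is 60% of the 5021 matched pairs rounded down and that each criterion is rounded to the nearest integer; both rounding conventions are assumptions, and the function names are hypothetical.

```python
import math


def hidden_log2(n_train: int) -> int:
    """Hidden neuron count from the log2(T) criterion, rounded to nearest."""
    return round(math.log2(n_train))


def hidden_sqrt(n_train: int) -> int:
    """Hidden neuron count from the sqrt(T) criterion, rounded to nearest."""
    return round(math.sqrt(n_train))


n_pairs = 5021                  # matched sample pairs after preprocessing
n_train = int(0.6 * n_pairs)    # 60% training share, rounded down
print(n_train, hidden_log2(n_train), hidden_sqrt(n_train))
```

With these conventions the script reproduces T = 3012 and the hidden-layer sizes 12 and 55 listed in the table.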
The performance of the retrieval model was quantified using several statistical error evaluation indices, including the mean absolute error (MAE), root-mean-square error (RMSE) and coefficient of determination (R²).
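For reference, the three indices can be computed as in the sketch below, using their textbook definitions. The function names and the sample values are hypothetical.

```python
import math


def mae(y_true, y_pred) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred) -> float:
    """Root-mean-square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 5.0]
print(mae(y_true, y_pred), rmse(y_true, y_pred), r2(y_true, y_pred))
```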
5.1. Temperature channel subset of GIIRS
We combined the channels selected from each cluster’s central sequence to obtain the final channel subset of GIIRS. To verify that this subset of channels is suitable for the retrieval application, we compared it with two other sets of temperature channels: that of IASI and that of CrIS. However, a direct comparison is not trivial, as the IASI temperature channels do not have exactly the same frequencies as those of GIIRS, and the channels were chosen based on different criteria. To aid the comparison, the GIIRS spectra were linearly interpolated to the spectral resolution of IASI. No processing was required for CrIS because it has the same spectral resolution as GIIRS in the long-wavelength band.
Figure 8 compares the temperature channels selected from GIIRS with those of IASI and CrIS in the range 700–1130 cm⁻¹. Channels outside this interval are excluded because they are not within the long-wavelength detection range of GIIRS. The number of channels of GIIRS, IASI and CrIS is 106, 49 and 23, respectively. For GIIRS, there are 89 temperature channels distributed at 700–780 cm⁻¹ (Table 3), and 17 temperature channels distributed at 1000–1130 cm⁻¹. The channel subset of GIIRS can reflect more temperature-sensitive information than the other two channel subsets.
Index | Wavenumber (cm⁻¹) | Notes |
2 | 701.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
3 | 701.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
4 | 702.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
5 | 703.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
6 | 703.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
8 | 705.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
9 | 705.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
10 | 706.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
11 | 706.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
13 | 708.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
14 | 708.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
15 | 709.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
16 | 710.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
17 | 710.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
19 | 711.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
20 | 712.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
21 | 713.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
22 | 713.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
23 | 714.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
24 | 715.000 | ${C_3}$ |
25 | 715.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
26 | 716.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
27 | 716.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
28 | 717.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
29 | 718.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
30 | 718.750 | ${C_0},\;{C_1},\;{C_3}$ |
31 | 719.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
32 | 720.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
33 | 720.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
34 | 721.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
36 | 722.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
37 | 723.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
38 | 723.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
39 | 724.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
40 | 725.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
42 | 726.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
43 | 726.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
44 | 727.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
45 | 728.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
46 | 728.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
48 | 730.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
49 | 730.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
50 | 731.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
51 | 731.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
52 | 732.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
54 | 733.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
55 | 734.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
56 | 735.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
57 | 735.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
59 | 736.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
60 | 737.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
61 | 738.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
62 | 738.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
63 | 739.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
65 | 740.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
66 | 741.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
67 | 741.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
68 | 742.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
69 | 743.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
71 | 744.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
72 | 745.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
73 | 745.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
74 | 746.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
75 | 746.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
77 | 748.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
78 | 748.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
79 | 749.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
80 | 750.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
83 | 751.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
84 | 752.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
85 | 753.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
86 | 753.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
88 | 755.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
89 | 755.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
90 | 756.250 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
91 | 756.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
92 | 757.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
94 | 758.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
95 | 759.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
96 | 760.000 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
97 | 760.625 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
100 | 762.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
101 | 763.125 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
102 | 763.750 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
103 | 764.375 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
106 | 766.250 | ${C_0},\;{C_1},\;{C_3}$ |
107 | 766.875 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
108 | 767.500 | ${C_0},\;{C_1},\;{C_2},\;{C_3}$ |
113 | 770.625 | ${C_0},\;{C_1},\;{C_3}$ |
Table 3. The GIIRS temperature channel selection at 700–780 cm⁻¹. A total of 89 comparable channels are listed for retrieval validation. The first column gives the index of the channel. The second column gives the wavenumber corresponding to the channel. In the third column, the markers C0, C1, C2 and C3 indicate the central sequences from which the channel was selected.
Figure 8. A comparison of the 23 temperature channels of CrIS, 49 temperature channels of IASI, and 106 temperature channels of GIIRS selected using our method in the long-wavelength band (700–1130 cm⁻¹).
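The index and wavenumber columns of Table 3 are related by a simple linear mapping. The sketch below infers, from the table rows themselves, that the index is zero-based with index 0 at 700 cm⁻¹; that convention and the function name are assumptions rather than something the text states.

```python
def channel_wavenumber(index: int, start_cm1: float = 700.0,
                       step_cm1: float = 0.625) -> float:
    """Wavenumber (cm^-1) of a long-wave channel, assuming index 0 is 700 cm^-1."""
    return start_cm1 + step_cm1 * index


# Spot checks against the first and last rows of Table 3.
for index in (2, 24, 113):
    print(index, channel_wavenumber(index))
```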
5.2. Retrieval of temperature profiles
The results of the retrieval mainly depend on the model architecture. Specifically, its performance is limited by the setting of the neural network hyperparameters, such as the structure of the network, the optimization algorithm, the number of input and output neurons, and the number of hidden layer units.
With the other hyperparameters fixed, our neural network model requires only two user-defined settings. The first is the number of input neurons, which is derived from the different channel subsets, including those selected by SSA, the channel subset of IASI, and that of CrIS. We only used GIIRS temperature channels in the 700–780 cm⁻¹ range (Fig. 8) as the network’s input, because there are no comparable temperature channels at 1000–1130 cm⁻¹. The second is the number of hidden layer neurons (Table 2), which was set based on the empirical formulas used in previous studies. In short, the two candidate parameter sets were [89, 49, 23] and [55, 23, 12, 10] for the number of input neurons and hidden layer neurons, respectively. Therefore, a total of 12 models were used to assess retrieval performance. Training of each model was stopped by validation checks to limit overfitting and underfitting, which also reduced the model calculation time.
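The 12 model configurations are simply the Cartesian product of the two candidate sets. The sketch below enumerates them and, as an added illustration, counts the trainable parameters of each network under the assumption of fully connected layers with one bias per hidden and output node; the parameter count is not a figure reported in the text.

```python
from itertools import product

N_OUTPUTS = 23  # one output node per pressure level

input_sizes = {"GIIRS (SSA)": 89, "IASI": 49, "CrIS": 23}
hidden_sizes = [55, 23, 12, 10]


def n_parameters(n_in: int, n_hidden: int, n_out: int = N_OUTPUTS) -> int:
    """Weights plus biases of a fully connected single-hidden-layer network."""
    return n_in * n_hidden + n_hidden + n_hidden * n_out + n_out


configs = [(name, n_in, n_hidden)
           for (name, n_in), n_hidden in product(input_sizes.items(), hidden_sizes)]

for name, n_in, n_hidden in configs:
    print(f"{name:12s} inputs={n_in:3d} hidden={n_hidden:3d} "
          f"parameters={n_parameters(n_in, n_hidden)}")
```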
As shown in Fig. 9, the linear regression lines between all predicted and associated true values are close to the best-fit regression lines. Attention must be paid to the spines representing the distribution of the sample spaces, where the two highest peaks of the two curves correspond to each other. This means that, among the 23 selected pressure levels, there are two levels whose temperatures are each close to that of a neighboring pressure level. In other words, the overall distribution of the samples indicates whether the selected pressure levels adequately represent the important changes in a complete temperature profile.
Figure 9. Comparison of the retrieval performance of different temperature channel subsets under a different number of hidden layer neurons in the comparable band (700–780 cm⁻¹). From left to right, the number of hidden layer neurons in each row’s subplot is 55, 23, 12 and 10, respectively. The number of input neurons is set to (a–d) 89, corresponding to the number of GIIRS temperature channels; (e–h) 49, corresponding to the number of temperature channels of IASI; and (i–l) 23, corresponding to the number of temperature channels of CrIS.
From the distribution of the testing samples, the more neurons in the input and the hidden layers, the closer the samples are to the regression line and the smaller the error between the predicted and true values. Accordingly, Fig. 9a shows that the model performs best when the numbers of input and hidden layer neurons are set to 89 and 55, respectively. Although the best model also produces certain anomalous predictions, these results do not affect the overall performance on the testing set.
Instead of using the overall testing set to evaluate the models, we analyzed the retrieval performance at each selected pressure level (Fig. 10). In general, increasing the number of input neurons or neurons in the hidden layer increases the retrieval accuracy at each pressure level. The reason is that a more complex neural network model can achieve better approximations. Specifically, for our temperature retrieval model, more input neurons mean that more energy information of the brightness temperature spectrum can be learned, especially when the channel subset selected by the proposed SSA method is used as the input rather than either of the other two channel subsets. Likewise, more hidden layer units also enhance the learning ability of the model. Thus, the model with 89 input neurons and 55 hidden layer neurons provided the best performance at each pressure level. It should be mentioned that, although the performance of the selected channels has been verified through ANN models, the RMSE values are relatively high compared to community practice due to the limitations of matched samples and model architecture. These issues will be addressed in subsequent studies.
Figure 10. Comparison of the retrieval performance of different temperature channel subsets at selected pressure levels under different numbers of neurons in the hidden layer. From left to right, the number of hidden layer neurons in each column of subplots is 55, 23, 12 and 10. In each subplot, the number of channels corresponding to the red, blue, and purple lines is 89, 49 and 23, respectively, which is the number of temperature channels of GIIRS, IASI, and CrIS in the comparable range (700–780 cm⁻¹).
In each model, the retrieval result at 850 hPa is always the most accurate of all the levels. However, changes in model structure do not further significantly improve its accuracy. When the number of hidden layer nodes is 55, our channel subset slightly reduces the values of RMSE and MAE, but significantly improves the performance at pressure levels above 850 hPa, especially at 500 hPa. Wang and Wei (2012) noted that the accuracy of the 500 hPa geopotential height forecast is an important and classical measure of forecast skill.
Even though the accuracy of the temperature retrieval near 825 hPa is high, the R² score at this level is much smaller than that of the other levels, indicating that there is little variation in the temperature values at this pressure level. Hence, even a small prediction error noticeably lowers the R² score. The R² score at all other pressure levels is higher when using the best model.
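The sensitivity of R² to the spread of the true values can be illustrated with a toy example. In the sketch below, the same prediction errors are applied to a widely varying and a narrowly varying set of temperatures; all numbers are invented for illustration and are not taken from the dataset.

```python
def r2_score(y_true, y_pred) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


errors = [0.5, -0.5, 0.5, -0.5]          # identical errors (K) in both cases
wide = [270.0, 280.0, 290.0, 300.0]      # level with large temperature spread
narrow = [284.0, 285.0, 286.0, 287.0]    # level with small temperature spread

r2_wide = r2_score(wide, [t + e for t, e in zip(wide, errors)])
r2_narrow = r2_score(narrow, [t + e for t, e in zip(narrow, errors)])
print(r2_wide, r2_narrow)
```

Both cases have the same RMSE of 0.5 K, yet the narrow-spread case scores a visibly lower R², which is the effect described for the level near 825 hPa.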