With limited members, an ensemble cannot guarantee a perfect Gaussian probability distribution, but each ensemble member can be regarded as a sample from an approximate Gaussian distribution, including members of near-zero probability that fall outside the distribution range (Vukicevic et al., 2008). Ideally, the samples should reflect the true state (the observation), but in reality they do not. Some samples correspond to outliers and small-probability events, and are sometimes far from the verifying observations. Such improbable or outlying samples are meaningful for statistical analysis, but may degrade the accuracy of ensemble forecasts. One means of addressing this problem is to transform the probability density function (PDF) for the truth into a PDF for the observations to be verified (Saetra et al., 2004). In this study, observation data were used to constrain the probability distribution of the samples. This method differs from data assimilation and can be called an observation-based ensemble technique. In general, equal weights are computed by a simple arithmetic average, whereas unequal weights are determined using more complex techniques. The same principle applies to optimal ensembles, and it has been applied successfully in forecasting both warm-season mesoscale convective systems (Jankov et al., 2007a) and cold-season topographic forcing (Jankov et al., 2007b).
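The distinction between equal and unequal weighting can be sketched as follows. This is a minimal illustration with invented member values and invented recent-error scores; inverse-error weighting is just one common choice of unequal weights, not necessarily the scheme used in the cited studies:

```python
import numpy as np

# Hypothetical ensemble member forecasts of some scalar quantity
# (values invented for illustration).
members = np.array([25.1, 26.3, 24.8, 30.2, 25.5])

# Equal weights: the ensemble mean is a simple arithmetic average.
equal_mean = members.mean()

# Unequal weights: here, inversely proportional to each member's
# recent forecast error (errors also invented). The member with the
# largest error (30.2, error 3.5) is strongly down-weighted.
recent_errors = np.array([1.2, 0.8, 1.0, 3.5, 0.9])
weights = 1.0 / recent_errors
weights /= weights.sum()          # normalize so weights sum to 1
weighted_mean = np.dot(weights, members)
```

With these invented numbers the weighted mean is pulled below the equal-weight mean, because the outlying member contributes much less.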

Track forecasts and maximum surface wind speed forecasts can be used to quantitatively assess risk and to support earlier, more appropriate decisions on coastal evacuations (Hamill et al., 2011). TCs usually occur far from land, and therefore data assimilation is not very satisfactory. Over the past 20 years, track forecast errors at lead times of 1 to 5 days have been reduced by more than 50% (as of 2013; Cangialosi and Franklin, 2014). Nevertheless, improvements in NWP are still required to further reduce TC track forecast errors. Sample optimization, also known as observation-based ensemble subsetting, builds on data and methods from previous work (Lee and Wong, 2002; Yamaguchi et al., 2012; Qi et al., 2014; Dong and Zhang, 2016). Qi et al. (2014) first proposed a method of obtaining an ensemble mean track forecast by weighting members as a function of their short-term track forecast errors. Dong and Zhang (2016) applied this method to a single ensemble, modifying the subset selection, and extended it to a super-ensemble. The China Meteorological Administration found that, for a given forecast ensemble, some samples yield small track forecast errors ("good" samples), whereas others yield poor forecast tracks ("bad" samples) that deviate substantially from the truth. Good members should therefore be identified objectively. In this study, samples close to the expected values (observations) are identified as "good", and those with small probability are regarded as "bad"; the bad samples are replaced by good ones so that the ensemble reflects the true state as closely as possible. To determine the effect of sample optimization on ensemble forecasts, some studies, such as Dong and Zhang (2016), used the best-track positions of hurricanes that occurred during 2012-13. However, best-track datasets are not updated for the most recent tropical cyclones.
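The good/bad member replacement described above can be sketched as follows. This is an assumed, simplified implementation with synthetic member track positions; the distance from each member's position to the verifying observed position stands in for the short-term track forecast error used to rank members:

```python
import numpy as np

rng = np.random.default_rng(0)
n_members = 20

# Hypothetical member-forecast TC positions (lon, lat) at one lead
# time, and a verifying observed position (all values invented).
tracks = rng.normal(loc=[130.0, 20.0], scale=0.8, size=(n_members, 2))
obs = np.array([130.2, 20.1])

# Rank members by distance to the observation (track-error proxy):
# small error = "good" sample, large error = "bad" sample.
errors = np.linalg.norm(tracks - obs, axis=1)
order = np.argsort(errors)                # best members first

# Replace the n_replace worst ("bad") members with copies of the
# n_replace best ("good") members.
n_replace = 4
optimized = tracks.copy()
optimized[order[-n_replace:]] = tracks[order[:n_replace]]

mean_before = tracks.mean(axis=0)         # raw ensemble mean track
mean_after = optimized.mean(axis=0)       # optimized ensemble mean
```

By construction, the worst track error remaining in the optimized ensemble cannot exceed the worst error in the original ensemble.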

It is noteworthy that ensemble predictions derived from operational weather forecasting models can identify the expected spread of weather conditions and assess the probability of particular weather events. Because observation data are used to constrain the probability distribution, sample optimization may affect the ensemble spread. The use of ensemble spread as a predictor of ensemble mean skill has been investigated in many studies. Houtekamer (1993) showed that the spread has the most predictive value when it is "extreme", that is, when it is very large or very small compared with its mean value. Whitaker (1998) pointed out that the more the spread departs from its climatological mean value, the more useful it is as a predictor of skill; when the spread is close to the climatological mean, it has very little predictive value, because the forecast error becomes essentially a random draw from the climatological distribution. Grimit and Mass (2007) used a statistical model based on an ensemble prediction system under a perfect-forecast assumption. In their system, the underlying probability density function of the forecast error is known, individual ensemble members represent random draws from this distribution, and the ensemble spread provides a measure of the expected forecast error. Generally, a larger (smaller) ensemble dispersion implies more (less) uncertainty in the forecast (Hopson, 2014). However, replacing an excessive number of bad samples with good ones may lead to an extremely small ensemble spread, which will produce inaccurate ensemble means. In contrast, replacing too few bad samples may prevent any evident adjustment and improvement of the ensemble mean. Therefore, the number of samples to be replaced should be chosen carefully, such that the ensemble mean is improved but the ensemble spread is not excessively reduced.
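The trade-off above can be illustrated with a small numerical sketch. All data are synthetic: one-dimensional member forecasts are ranked by closeness to an invented observation, the worst members are replaced by copies of the best, and the ensemble spread (standard deviation about the ensemble mean) is tracked as the replacement count grows:

```python
import numpy as np

rng = np.random.default_rng(1)
members = rng.normal(0.0, 1.0, size=50)    # synthetic 1-D member forecasts
obs = 0.2                                  # invented verifying observation

# Rank members by closeness to the observation: best first.
order = np.argsort(np.abs(members - obs))

def spread_after_replacing(n_replace):
    """Ensemble spread after replacing the n_replace worst members
    with copies of the n_replace best members."""
    opt = members.copy()
    if n_replace > 0:
        opt[order[-n_replace:]] = members[order[:n_replace]]
    return opt.std()

# Spread for increasingly aggressive replacement.
spreads = [spread_after_replacing(n) for n in (0, 5, 10, 20)]
```

As more bad members are replaced, the ensemble concentrates around the observation and the spread collapses, which is exactly why the replacement count must balance a better ensemble mean against an unrealistically small spread.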