BNN models were built to forecast North American surface air temperature (SAT) and precipitation (PRCP) at lead times of 3, 6, 9 and 12 months. The predictors for the SAT forecast were generated from an extended empirical orthogonal function (EEOF) analysis of three combined variables: quasi-global sea surface temperature (SST), 500-mb geopotential height (Z500) over the Northern Hemisphere, and the SAT itself, with a weight factor of 1.5 applied to the SST field. The predictors for the PRCP forecast were extracted from an EEOF analysis of the combined SST and Z500 fields, with the SST field double-weighted.
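The predictor extraction described above can be illustrated with a minimal sketch. It assumes each field has been flattened to an array of shape (time, grid points), that every field is standardized before weighting, and that the EEOF analysis is carried out as a singular value decomposition of the time-lagged, combined data matrix. The function names, the number of lags, the number of retained modes, and the synthetic data are all hypothetical; only the weight of 1.5 on SST comes from the text.

```python
import numpy as np

def standardize(field):
    """Remove the time mean and scale each grid point to unit variance."""
    anomalies = field - field.mean(axis=0)
    std = anomalies.std(axis=0)
    std[std == 0] = 1.0  # guard against constant grid points
    return anomalies / std

def eeof_predictors(fields, weights, n_lags, n_modes):
    """Leading principal components of a weighted, time-lagged field combination.

    fields  : list of arrays, each shaped (n_time, n_space_i)
    weights : one scalar weight per field (assumed applied after standardizing)
    n_lags  : number of consecutive time steps stacked into each row (assumption)
    n_modes : number of leading modes kept as predictors (assumption)
    """
    combined = np.hstack([w * standardize(f) for f, w in zip(fields, weights)])
    n_rows = combined.shape[0] - n_lags + 1
    # Row t holds the combined state at times t, t+1, ..., t+n_lags-1.
    extended = np.hstack([combined[lag:lag + n_rows] for lag in range(n_lags)])
    extended = extended - extended.mean(axis=0)
    u, s, _ = np.linalg.svd(extended, full_matrices=False)
    pcs = u[:, :n_modes] * s[:n_modes]
    explained = s[:n_modes] ** 2 / np.sum(s ** 2)
    return pcs, explained

# Synthetic stand-ins for the SST, Z500 and SAT anomaly fields.
rng = np.random.default_rng(0)
sst = rng.standard_normal((120, 30))
z500 = rng.standard_normal((120, 20))
sat = rng.standard_normal((120, 10))

pcs, explained = eeof_predictors(
    [sst, z500, sat], weights=[1.5, 1.0, 1.0], n_lags=4, n_modes=5
)
```

For the PRCP predictors, the same hypothetical function would be called with only the SST and Z500 fields and weights of 2.0 and 1.0.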
Cross-validation was used to evaluate the forecast skill: the data from 2 consecutive years were reserved as testing (validation) data, and the data from the remaining years were used to train the models. An ensemble average was also used. The training data were resampled by the bootstrap method, and one BNN model was trained on each sample. The resampling and training were repeated 200 times, and the average of the predictions from these 200 models was used as the final forecast.
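The evaluation procedure above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: a least-squares linear model stands in for the BNN, the data are synthetic, correlation is used as the skill measure, and all function names are hypothetical. Only the 2-year held-out blocks and the bootstrap ensemble averaging follow the text (the demo uses 20 members rather than 200 to keep it fast).

```python
import numpy as np

def fit_linear(x, y):
    """Stand-in model (least squares); the abstract trains a BNN at this step."""
    design = np.column_stack([np.ones(len(x)), x])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef

def predict_linear(coef, x):
    return np.column_stack([np.ones(len(x)), x]) @ coef

def bootstrap_ensemble_predict(x_train, y_train, x_test, n_members, rng):
    """Train one model per bootstrap resample and average the predictions."""
    preds = []
    for _ in range(n_members):
        # Resample the training cases with replacement.
        idx = rng.integers(0, len(x_train), size=len(x_train))
        coef = fit_linear(x_train[idx], y_train[idx])
        preds.append(predict_linear(coef, x_test))
    return np.mean(preds, axis=0)

def cross_validated_forecast(x, y, years, block=2, n_members=200, seed=0):
    """Hold out `block` consecutive years at a time; train on the remaining years."""
    rng = np.random.default_rng(seed)
    unique_years = np.unique(years)
    forecast = np.full(len(y), np.nan)
    for start in range(0, len(unique_years), block):
        held_out = unique_years[start:start + block]
        test = np.isin(years, held_out)
        forecast[test] = bootstrap_ensemble_predict(
            x[~test], y[~test], x[test], n_members, rng
        )
    return forecast

# Synthetic example: 20 years with 4 samples per year and 3 predictors.
rng = np.random.default_rng(1)
years = np.repeat(np.arange(1950, 1970), 4)
x = rng.standard_normal((80, 3))
y = x @ np.array([1.0, -0.5, 0.2]) + 0.1 * rng.standard_normal(80)

forecast = cross_validated_forecast(x, y, years, n_members=20)
skill = np.corrcoef(forecast, y)[0, 1]  # correlation skill (assumed measure)
```

Because every forecast value comes from models that never saw the corresponding 2-year block, the resulting skill estimates out-of-sample performance rather than goodness of fit.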
For the SAT forecast, without correcting for the outliers, the skills of the BNN model were lower than the LR skills in most situations, especially when more hidden neurons were used in the BNN model. When the outliers were corrected, the skills generally improved and in most situations were better than those of the LR models. At lead times of 3 and 6 months, the BNN-MD and BNN-RD models gave comparable skills, while at longer lead times the BNN-RD models showed some advantage over the BNN-MD models. Hence, the nonlinear BNN model performs better than the linear model during interpolation but not during extrapolation, and correcting for the extrapolation data points leads to improved forecasts for the nonlinear model.
For the PRCP forecast, the BNN model gave better skills than the LR model even without correcting for outliers. Correcting for outliers further improved the skills.