19th Conference on Probability and Statistics
Sixth Conference on Artificial Intelligence Applications to Environmental Science

J3.2

Improving Bayesian neural network predictions of N. American seasonal climate by correcting for extrapolations

Aiming Wu, University of British Columbia, Vancouver, BC, Canada; and W. W. Hsieh, A. J. Cannon, and A. Shabbar

In Bayesian neural network (BNN) models, the regularization parameters are determined automatically to prevent overfitting. Even so, such nonlinear BNN models can still underperform linear regression (LR) models under forecast cross-validation. The likely culprits are test points whose predictors are outliers relative to the training data used to build the models: a nonlinear model can extrapolate wildly for outlying predictors, thereby degrading the forecast skill. Thus, if a test datum is judged to be an outlier, we simply replace the nonlinear model forecast with the LR forecast. Outliers are detected using the Mahalanobis distance (MD) and the robust distance (RD), at the 75%, 90%, 95% and 99% quantiles; the resulting models are referred to as BNN-MD and BNN-RD, since they replace the BNN forecast with the LR forecast at the "outlier" points.
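The screen-and-replace step can be sketched as follows. This is a minimal NumPy illustration, not the authors' code: the function name hybrid_forecast, the use of an empirical quantile of the training-set distances as the threshold, and the ordinary sample covariance (rather than the robust estimator underlying the RD variant) are all assumptions made for the sketch.

```python
import numpy as np

def mahalanobis_distances(X, mean, cov_inv):
    """Mahalanobis distance of each row of X from the training mean."""
    d = X - mean
    return np.sqrt(np.einsum('ij,jk,ik->i', d, cov_inv, d))

def hybrid_forecast(x_train, x_test, bnn_pred, lr_pred, quantile=0.95):
    """Replace BNN forecasts with LR forecasts at outlying test predictors.

    x_train, x_test: (n, p) predictor arrays; bnn_pred, lr_pred: forecasts
    already made for x_test. The threshold is the empirical `quantile` of
    the training-set Mahalanobis distances (an assumption; a chi-squared
    quantile could be used instead).
    """
    mean = x_train.mean(axis=0)
    cov_inv = np.linalg.inv(np.cov(x_train, rowvar=False))
    threshold = np.quantile(mahalanobis_distances(x_train, mean, cov_inv),
                            quantile)
    outlier = mahalanobis_distances(x_test, mean, cov_inv) > threshold
    # Keep the nonlinear forecast for interpolation, fall back to LR
    # where the test predictors lie outside the training cloud.
    return np.where(outlier, lr_pred, bnn_pred), outlier
```

Swapping in a robust location/scatter estimate (e.g. a minimum covariance determinant fit) in place of the sample mean and covariance would give the RD-based variant.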

BNN models were built to forecast North American surface air temperature (SAT) and precipitation (PRCP) at lead times of 3, 6, 9 and 12 months. The predictors for the SAT forecast were generated from an extended empirical orthogonal function (EEOF) analysis of a combination of 3 variables: quasi-global sea surface temperature (SST), 500-mb geopotential height (Z500) over the Northern Hemisphere, and the SAT itself, with a weighting factor of 1.5 applied to the SST field. The predictors for the PRCP forecast were extracted from an EEOF analysis of the combined SST and Z500 fields, with the SST double-weighted.
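A simplified version of this predictor construction is a weighted, combined EOF analysis of lag-augmented fields. The sketch below is illustrative only: lag_embed and combined_eofs are hypothetical names, the per-column normalization and the point at which the weights are applied are assumptions, and a full EEOF analysis involves further choices (domain, lag window, mode truncation) not shown here.

```python
import numpy as np

def lag_embed(field, n_lags):
    """EEOF-style embedding: augment the space dimension with lagged copies.

    field: (time, space) anomaly array. Returns (time - n_lags, space * (n_lags + 1)).
    """
    T = field.shape[0] - n_lags
    return np.concatenate([field[l:l + T] for l in range(n_lags + 1)], axis=1)

def combined_eofs(fields, weights, n_modes=5):
    """EOF analysis of several weighted fields concatenated in space.

    fields: list of (time, space_i) arrays on a common time axis;
    weights: per-field scaling (e.g. 1.5 on the SST block).
    Returns the leading principal components (the predictors) and EOFs.
    """
    blocks = []
    for f, w in zip(fields, weights):
        f = f - f.mean(axis=0)                       # anomalies
        f = f / f.std(axis=0).clip(min=1e-12)        # normalize (assumption)
        blocks.append(w * f)
    X = np.concatenate(blocks, axis=1)
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    pcs = U[:, :n_modes] * s[:n_modes]               # principal components
    return pcs, Vt[:n_modes]
```

The leading principal components of the combined, weighted matrix then serve as the model predictors.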

Cross-validation was used to evaluate the forecast skill: the data from 2 consecutive years were reserved as test (validation) data, and the data from the remaining years were used to train the models. An ensemble average was also used: the training data were resampled by the bootstrap method, and one BNN model was trained on each sample. The resampling and training were repeated 200 times, and the average prediction of these 200 models was used as the final forecast.
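The cross-validation splitter and the bootstrap ensemble can be sketched as below. This is a minimal NumPy illustration under stated assumptions: two_year_blocks and bootstrap_ensemble_forecast are hypothetical names, fit_fn stands in for training a BNN (any fit-and-predict routine), and details such as the handling of an odd number of years are not addressed.

```python
import numpy as np

def two_year_blocks(years):
    """Yield (train_idx, test_idx), holding out 2 consecutive years at a time."""
    n = len(years)
    for i in range(0, n - 1, 2):
        test = [i, i + 1]
        train = [j for j in range(n) if j not in test]
        yield train, test

def bootstrap_ensemble_forecast(x_train, y_train, x_test, fit_fn,
                                n_boot=200, rng=None):
    """Average the forecasts of n_boot models, each trained on a bootstrap
    resample of the training data.

    fit_fn(x, y) trains one model and returns a predict function; here it
    stands in for BNN training (assumption).
    """
    rng = np.random.default_rng(rng)
    n = len(x_train)
    preds = []
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)   # resample with replacement
        model = fit_fn(x_train[idx], y_train[idx])
        preds.append(model(x_test))
    return np.mean(preds, axis=0)
```

With n_boot=200 this reproduces the ensemble size described above; the ensemble mean damps the variance of individual nonlinear fits.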

For the SAT forecast, without correcting for the outliers, the skills of the BNN model were lower than the LR skills in most situations, especially when more hidden neurons were used in the BNN model. When the outliers were corrected for, the skills generally improved, and in most situations exceeded those of the LR models. At lead times of 3 and 6 months, the BNN-MD and BNN-RD models gave comparable skill, while at longer lead times the BNN-RD models showed some advantage over the BNN-MD models. Hence the nonlinear BNN model performs better than the linear model when interpolating but not when extrapolating, and correcting for the extrapolated data points leads to improved forecasts from the nonlinear model.

For the PRCP forecast, the BNN model gave better skill than the LR model even without correcting for outliers; correcting for outliers further improved the skill.

Recorded presentation available.

Joint Session 3, Bridging the Gap between Artificial Intelligence and Statistics in Applications to Environmental Science-I
Wednesday, 23 January 2008, 8:30 AM-10:00 AM, 219
