3.1 NOAA's Second-Generation Multi-Decadal Ensemble Reforecast Data Set and Applications to Hydrology

Wednesday, 9 January 2013: 10:30 AM
Room 18A (Austin Convention Center)
Thomas M. Hamill, NOAA, Boulder, CO; and J. S. Whitaker

For the statistical post-processing of weather forecasts, artificial intelligence / machine learning (AI/ML) techniques may provide particular advantages over simpler techniques (e.g., multiple linear regression) in situations where large amounts of training data are available. In such situations, AI/ML methods may be able to find relationships in the data that would be difficult to find, or statistically questionable, with limited training data. It therefore helps to have very large training data sets.

In this talk we will describe a new multi-decadal global ensemble reforecast data set that we believe may be of great interest to the AI/ML community. For every day from late 1984 to the present, 11-member global reforecasts were computed using the current (2012) operational version of the NOAA Global Ensemble Forecast System (GEFS). Forecasts extend to a lead time of 16 days. As with the operational model, the forecasts were computed at T254L42 (about ½-degree grid spacing) in week 1 and T190L42 (about ¾-degree) in week 2. Data were archived every 3 h out to 72 h, and every 6 h thereafter. Ninety-nine fields are freely available for fast download from NOAA/ESRL/PSD. A full data archive is maintained at the Department of Energy, where the data set was created. Prior to 2012, the Climate Forecast System Reanalysis supplied the control initial condition. Perturbed initial conditions were generated with the operational ensemble transform with rescaling technique.
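As a rough illustration of the archive cadence described above, the following Python sketch enumerates the lead times at which output would be available. It is only a sketch under stated assumptions: the function name and parameters are hypothetical, the 16-day limit is taken as 384 h, and the analysis time (0 h) is assumed to be included in the archive, which the abstract does not state.

```python
def archived_lead_hours(max_lead_h=384, switch_h=72, fine_step_h=3, coarse_step_h=6):
    """Return archived forecast lead times in hours.

    Assumptions (not confirmed by the abstract): hour 0 is archived,
    and the 16-day forecast ends at exactly 384 h.
    """
    # Every 3 h from 0 h out to and including 72 h
    fine = list(range(0, switch_h + 1, fine_step_h))
    # Every 6 h after 72 h, out to the end of the forecast
    coarse = list(range(switch_h + coarse_step_h, max_lead_h + 1, coarse_step_h))
    return fine + coarse


leads = archived_lead_hours()
print(f"{len(leads)} archived lead times, from {leads[0]} h to {leads[-1]} h")
```

Under these assumptions the list holds 25 three-hourly times followed by 52 six-hourly times. A list like this can be handy when looping over lead times in a downloaded subset, though the actual time coordinates should always be read from the files themselves.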

To help researchers and students start using this data set quickly, we have extracted some smaller subsets of this (gigantic, ~150 TB) reforecast data set and are making them available to the community in netCDF format.

This talk will review the reforecast data set, describe the procedures for accessing the data, and demonstrate some simple applications of the data set to post-processing.
