While the mode and evolution of convective features often match observed storm behavior, the placement of those features is approximate at best. Even a small displacement of an otherwise accurately represented feature is strongly penalized by traditional skill scores (e.g., the Equitable Threat Score). Traditional objective measures of forecast skill can thus assign misleadingly poor scores to subjectively good convective forecasts.
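To make the penalty concrete, the following sketch (not from this study; the grid, feature, and one-cell displacement are illustrative assumptions) computes the Equitable Threat Score for a small storm feature that is forecast perfectly in shape but shifted by a single grid cell. The standard ETS contingency-table formula is used.

```python
import numpy as np

def equitable_threat_score(forecast, observed):
    """ETS = (hits - hits_random) / (hits + misses + false_alarms - hits_random),
    where hits_random = (hits + misses)(hits + false_alarms) / total."""
    hits = np.sum(forecast & observed)
    misses = np.sum(~forecast & observed)
    false_alarms = np.sum(forecast & ~observed)
    total = forecast.size
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else np.nan

# A hypothetical 3-cell "storm" on a 10x10 grid; the forecast reproduces
# the feature exactly but displaces it one column to the right.
obs = np.zeros((10, 10), dtype=bool)
obs[4:7, 4] = True
fcst = np.roll(obs, 1, axis=1)  # same feature, shifted one grid cell

print(equitable_threat_score(fcst, obs))
```

Because the shifted feature produces zero grid-point overlap, the score comes out slightly below zero, i.e., nominally worse than a random forecast, despite the feature itself being well represented.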
In recent years, researchers have begun to develop new tools to address this shortcoming in traditional scores. Among these new techniques are object-based methods, which match and compare identifiable features (i.e., objects) between forecasts and observations even when the objects are displaced. We use a simpler technique here, one that keeps the strategy of the traditional scores but relaxes their criteria, giving a forecast credit for near misses within the neighborhood of a forecast event. Using such neighborhood-based scores, as well as other methods, we examine the 2010 Spring Experiment WRF-ARW results with an eye toward documenting differences, at various spatial scales, between forecasts initialized at different times; differences in forecast skill during various parts of the convective cycle; and forecast skill in the 0-48 hour range.
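One simple way to implement the neighborhood relaxation is sketched below (an illustrative assumption, not the study's exact scoring code; operational neighborhood scores such as the Fractions Skill Score differ in detail). Hits are counted whenever a forecast event has an observed event anywhere within a square neighborhood, and misses only when an observed event has no forecast event nearby, so a small displacement is no longer penalized.

```python
import numpy as np

def dilate(field, radius):
    """Expand each True cell to a (2*radius+1)-square neighborhood.
    np.roll wraps at the grid edges, which is acceptable for this
    interior-feature illustration."""
    out = np.zeros_like(field)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            out |= np.roll(np.roll(field, dy, axis=0), dx, axis=1)
    return out

def neighborhood_ets(forecast, observed, radius):
    """ETS with hit/miss criteria relaxed to the neighborhood scale."""
    hits = np.sum(forecast & dilate(observed, radius))
    misses = np.sum(observed & ~dilate(forecast, radius))
    false_alarms = np.sum(forecast & ~dilate(observed, radius))
    total = forecast.size
    hits_random = (hits + misses) * (hits + false_alarms) / total
    denom = hits + misses + false_alarms - hits_random
    return (hits - hits_random) / denom if denom else np.nan

# The same hypothetical displaced storm: a 3-cell line on a 10x10 grid,
# forecast one column to the right of the observations.
obs = np.zeros((10, 10), dtype=bool)
obs[4:7, 4] = True
fcst = np.roll(obs, 1, axis=1)

# radius=0 reduces to the strict point-by-point score; radius=1
# credits the one-cell near miss.
print(neighborhood_ets(fcst, obs, 0), neighborhood_ets(fcst, obs, 1))
```

Varying the neighborhood radius is what allows skill to be examined as a function of spatial scale: small radii behave like the traditional score, while larger radii reward forecasts that place convection in approximately the right region.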
Preliminary results indicate some intriguing differences in skill between the 00 and 12 UTC initializations. Scores suggest that forecasts initialized at 00 UTC generally show better skill than those initialized at 12 UTC. Other results show that the diurnal cycle is handled similarly regardless of initialization time: a distinct pattern, consistent across a wide range of spatial scales, appears in both the 00 UTC and 12 UTC initializations, with greater skill during 13-19 local time (i.e., before convective initiation and early in the convective cycle), followed by a drop in skill after 19 local time (i.e., as convection strengthens and develops).