15B.4 Evaluation of Experimental Forecasts from the NOAA Hazardous Weather Testbed Spring Experiment

Thursday, 4 June 2009: 2:15 PM
Grand Ballroom West (DoubleTree Hotel & EMC - Downtown, Omaha)
Tara L. Jensen, NCAR/RAL, Boulder, CO; and B. G. Brown, L. Nance, S. J. Weiss, J. S. Kain, and M. C. Coniglio

The Experimental Forecast Program (EFP) component of the NOAA Hazardous Weather Testbed (HWT) has conducted Spring Experiments since 2000. The main focus of recent Spring Experiments has been to gain an understanding of how to better use the output of near-cloud-resolving configurations of numerical models to predict convective storms. The primary organizers of the HWT-EFP are the National Severe Storms Laboratory (NSSL) and the Storm Prediction Center (SPC). The experiences of the HWT-EFP participants have shown that high-resolution convective storm predictions are at times difficult for operational forecasters to reconcile, in part because many solutions appear to be plausible for a given mesoscale environment.

The Model Evaluation Tools (MET) package, developed by the Developmental Testbed Center (DTC), was used in 2008 and will be used in 2009 to help evaluate WRF model performance for the Spring Experiment. Three important goals of these evaluations have been (i) to provide objective evaluations of the experimental forecasts; (ii) to supplement and compare with subjective assessments of performance; and (iii) to expose the forecasters and researchers to both new and traditional approaches for evaluating precipitation forecasts.

MET provides a variety of statistical tools for evaluating model-based forecasts using both gridded and point observations. WRF model forecasts of 1-h accumulated precipitation were evaluated using the Grid_stat and MODE tools within MET. Grid_stat applies traditional verification methods for gridded datasets, producing metrics such as the Equitable Threat Score (ETS), frequency bias, and a host of other statistics. MODE, the Method for Object-based Diagnostic Evaluation, provides an object-based verification of gridded forecasts by identifying and matching "objects" (i.e., areas of interest) in the forecast and observed fields and comparing the attributes of the forecast/observation object pairs.
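As a point of reference for the traditional metrics mentioned above, the following minimal sketch (not part of MET; a hypothetical illustration using the standard textbook formulas) computes ETS and frequency bias from a 2x2 contingency table built by thresholding forecast and observed precipitation:

```python
def ets_and_bias(hits, misses, false_alarms, correct_negatives):
    """Compute Equitable Threat Score and frequency bias from a
    2x2 contingency table of thresholded forecast/observation pairs."""
    total = hits + misses + false_alarms + correct_negatives
    # Hits expected by random chance, given the marginal totals
    hits_random = (hits + misses) * (hits + false_alarms) / total
    # ETS: skill relative to random chance; 1 is perfect, <= 0 is no skill
    ets = (hits - hits_random) / (hits + misses + false_alarms - hits_random)
    # Frequency bias: forecast event frequency / observed event frequency
    bias = (hits + false_alarms) / (hits + misses)
    return ets, bias

# Example: counts of grid points exceeding a precipitation threshold
ets, bias = ets_and_bias(hits=40, misses=20, false_alarms=10,
                         correct_negatives=130)
```

With these illustrative counts the forecast under-predicts the event area (bias below 1) while retaining moderate skill (ETS near 0.45), mirroring the kind of threshold-dependent behavior reported for the EMC and NSSL runs below.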

The DTC evaluated thirty-three cases from NCEP's Environmental Modeling Center (EMC) and NSSL 4-km WRF runs (NMM and ARW dynamic cores, respectively) during the 2008 Spring Experiment. Lead times from 0-36 hours were included in the evaluation matrix. In general, the EMC and NSSL models verified similarly when using ETS as the indicator. ETS values ranged from 0 to 0.5, with generally higher scores for lead times of 12-24 hours. The NSSL forecasts had larger bias values for lower precipitation thresholds, while the EMC forecasts were characterized by larger biases for higher precipitation thresholds. Output from MODE analyses suggests that the NSSL 4-km model had the most skill in forecasting the larger objects (with a scale of approximately 40 km), while the EMC 4-km model had improved skill when considering smaller objects (approximately 20 km in scale).

The 2009 Spring Experiment runs from early May until early June. Results from both the 2008 and 2009 Spring Experiments will be discussed in this presentation.
