92nd American Meteorological Society Annual Meeting (January 22-26, 2012)

Wednesday, 25 January 2012: 11:00 AM
Objective Evaluation of Aviation Related Variables During 2010 Hazardous Weather Testbed (HWT) Spring Experiment
Room 335/336 (New Orleans Convention Center)
Lisa E. Coco, University of Northern Colorado, Greeley, CO; and T. L. Jensen, J. J. Levit, C. B. Entwistle, M. Harrold, S. J. Weiss, M. Xue, F. Kong, J. S. Kain, P. T. Marsh, A. J. Clark, M. C. Coniglio, and B. G. Brown

The 2010 Hazardous Weather Testbed (HWT) Spring Experiment ran from May 17 through June 18 and allowed researchers and forecasters to test and evaluate cutting-edge products for forecasting severe convective weather. The 2010 Spring Experiment was unique in that it was the first year the HWT incorporated aviation-related variables into the evaluation.

Reflectivity (REFC) and radar echo top (RETOP) fields were evaluated in relation to aviation needs, both spatially (object-based) and with traditional metrics. REFC values >= 30 dBZ, as well as RETOP values >= 25 kft, >= 30 kft, and >= 40 kft, were evaluated and compared. The Method for Object-Based Diagnostic Evaluation (MODE), a tool in the DTC's Model Evaluation Tools (MET) software package, identifies objects based on a user-specified convolution radius and threshold, transforming raw data into areas of interest with associated attributes. By applying MODE to the REFC and RETOP fields, a forecast can be validated through a variety of object-based metrics.
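To illustrate the convolution-threshold approach MODE uses, the short Python sketch below smooths a synthetic reflectivity field with a circular kernel, applies a threshold, and labels the connected regions that remain as objects. This is a minimal approximation of the general technique, not the actual MET implementation; the synthetic field, the 5-gridpoint radius, and the 30 dBZ threshold are illustrative assumptions.

    # Sketch of convolution-threshold object identification (illustrative,
    # not the actual MET/MODE code).
    import numpy as np
    from scipy import ndimage

    def identify_objects(field, conv_radius=5, threshold=30.0):
        """Smooth a 2-D field with a circular kernel, threshold it,
        and label the connected regions that remain as objects."""
        # Build a circular (disk-shaped) averaging kernel.
        y, x = np.ogrid[-conv_radius:conv_radius + 1,
                        -conv_radius:conv_radius + 1]
        kernel = (x**2 + y**2 <= conv_radius**2).astype(float)
        kernel /= kernel.sum()

        # Convolve (smooth) the raw field, then apply the threshold.
        smoothed = ndimage.convolve(field, kernel, mode="constant", cval=0.0)
        mask = smoothed >= threshold

        # Each connected region in the mask becomes one object.
        labels, n_objects = ndimage.label(mask)
        return labels, n_objects

    # Example: a synthetic field with a single 45 dBZ storm core.
    refc = np.zeros((100, 100))
    refc[40:60, 40:60] = 45.0
    labels, n = identify_objects(refc)
    print(f"{n} object(s) identified")

Attributes such as centroid location, area, and orientation can then be computed from each labeled region and compared between forecast and observed objects.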

Traditional statistics such as the Critical Success Index (CSI, or Threat Score), the Gilbert Skill Score (GSS, or Equitable Threat Score), and frequency bias were compared to MODE attributes such as centroid distance, boundary distance, total interest, and symmetric difference to see how object-based and categorical statistics differ. Because the traditional statistics describe dichotomous variables computed from a two-by-two contingency table, while MODE matches and merges the areas of interest a forecaster's eye would naturally pick out, the two approaches yielded differing assessments of model performance.
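For reference, these categorical scores follow their standard definitions from the two-by-two contingency table of forecast/observed yes-no outcomes. The sketch below computes them from hypothetical counts; the function name and the numbers are illustrative, not taken from MET.

    # Standard categorical statistics from 2x2 contingency counts
    # (hits, false alarms, misses, correct negatives).
    def categorical_stats(hits, false_alarms, misses, correct_negatives):
        n = hits + false_alarms + misses + correct_negatives

        # Critical Success Index (Threat Score): hits over everything
        # that was forecast or observed.
        csi = hits / (hits + false_alarms + misses)

        # Frequency bias: forecast "yes" count over observed "yes" count;
        # values above 1 indicate overforecasting.
        fbias = (hits + false_alarms) / (hits + misses)

        # Gilbert Skill Score (Equitable Threat Score): CSI adjusted for
        # hits expected by random chance.
        hits_random = (hits + false_alarms) * (hits + misses) / n
        gss = (hits - hits_random) / (hits + false_alarms + misses - hits_random)

        return {"CSI": csi, "GSS": gss, "FBIAS": fbias}

    # Hypothetical counts: an FBIAS of about 2.7 signals overforecasting.
    print(categorical_stats(hits=50, false_alarms=150, misses=25,
                            correct_negatives=775))

Because these scores reward only gridpoint-by-gridpoint overlap, a forecast that is displaced but otherwise realistic scores poorly, which is the gap the object-based MODE attributes are designed to address.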

The goal of the Developmental Testbed Center (DTC) throughout the Spring Experiment collaboration was to evaluate single-value and probabilistic output from several models: the 4-km CAPS Storm-Scale Ensemble (all 26 members plus derived products), the CAPS 1-km deterministic run, the SREF ensemble products (32-35 km), the 12-km NAM, and the 3-km NOAA/ESRL High Resolution Rapid Refresh (HRRR). Three of the models evaluated by the DTC contained RETOP as an output variable and were therefore the focus of the aviation-related evaluation: the HRRR, the CAPS Ensemble (Probability Matched) Mean, and the CAPS 1-km deterministic run. Deterministic models and ensemble products were compared across a range of convective event types to see how each handled the forecast.

Subjectively, the HRRR and the CAPS 1 km appear to perform better for the REFC and RETOP products. Objectively, all models appear to overpredict RETOP areal coverage by a factor of 2 to 5 based on frequency bias (FBIAS). The HRRR and the CAPS 1 km show similar skill for RETOP forecasts, suggesting that the added computational expense of 1-km grid spacing may not be justified at this time. Finally, the probability-matching post-processing technique appears to inflate the overprediction of cloud-shield areal extent to a level that is not useful for forecasting.
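To put the frequency bias in concrete terms: an FBIAS of 3 at a given RETOP threshold would mean the forecast areal coverage is three times the observed coverage (hypothetically, 30,000 km² forecast versus 10,000 km² observed).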

This talk will present example cases along with results aggregated over the entire experiment, with emphasis on implications for aviation decision support tools.
