3B.2 Regional High Impact Hail Forecasting using Random Forests

Tuesday, 14 January 2020: 8:45 AM
156A (Boston Convention and Exhibition Center)
Amanda Burke, CAPS/University of Oklahoma, Norman, OK; and N. Snook and A. McGovern

Forecasts of severe hail produced using machine learning (ML) models over the contiguous United States (CONUS) have exhibited greater skill than those produced via traditional techniques. However, the environments responsible for hail formation differ regionally across the CONUS. To account for these regional variations, severe hail forecasts are produced using localized ML models. Random forests produce the regional forecasts using output from the High Resolution Ensemble Forecast system version 2 (HREFv2) as input. A diverse ensemble, the HREFv2 comprises WRF-ARW and NMMB members as well as four time-lagged members. The operational maximum expected size of hail (MESH), a Multi-Radar Multi-Sensor (MRMS) product, serves as verification of the ML severe hail forecasts.

Rather than restricting the training dataset to each region, the random forests are trained on all hail-producing CONUS storms from 1 April to 31 July 2017. Storms are weighted by their proximity to regional domains that experience greater climatological hail frequencies. For testing, only storms identified within the regional domains are examined. The weighted severe hail forecasts are compared to ML predictions produced without any weights to determine whether localized modeling results in superior forecasting performance.
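The proximity weighting described above can be illustrated with a minimal sketch. The abstract does not specify the weighting function, so the Gaussian decay, the 300 km length scale, the example storm locations, and the names `haversine_km` and `proximity_weight` are all assumptions made purely for illustration. The domain center uses approximate coordinates for Dallas, Texas.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def proximity_weight(storm_lat, storm_lon, center_lat, center_lon,
                     length_scale_km=300.0):
    """Hypothetical Gaussian weight: 1 at the domain center, decaying
    smoothly with distance. The functional form and the length scale are
    assumptions, not taken from the abstract."""
    d = haversine_km(storm_lat, storm_lon, center_lat, center_lon)
    return math.exp(-0.5 * (d / length_scale_km) ** 2)


# Approximate coordinates of Dallas, Texas (southern plains domain center).
DALLAS = (32.78, -96.80)

# Illustrative storm locations (made up for this sketch).
storms = {
    "near Dallas": (33.0, -97.0),
    "Denver area": (39.74, -104.99),
    "Boston area": (42.36, -71.06),
}

weights = {
    name: proximity_weight(lat, lon, *DALLAS)
    for name, (lat, lon) in storms.items()
}

for name, w in weights.items():
    print(f"{name}: weight = {w:.3g}")
```

Weights of this kind could then be supplied as per-sample weights during training, for example through the `sample_weight` argument of scikit-learn's `RandomForestClassifier.fit`, while the unweighted CONUS baseline would simply omit them.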

Of the chosen regions, the difference between regionally trained and CONUS-trained hail forecasts was greatest for the southern plains (trained around Dallas, Texas), in terms of objective, subjective, and statistical measures. In addition, ranks from permutation variable importance, a model interpretation technique, indicate that low-level temperature and dewpoint are more important in the southern plains than across the CONUS. Preliminary analysis suggests that the greater forecasting skill in the southern plains, compared to the lack of substantial improvement in other regions, results from the large number of severe hail events in the southern plains within the training period. Examining a larger dataset covering multiple years, as well as different weighting functions, could improve forecast performance in other regions.
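Permutation variable importance, mentioned above, measures how much a model's score drops when one input variable is shuffled. The following is a minimal, generic sketch of the technique, not the authors' implementation: the toy threshold model, the synthetic data, the accuracy metric, and the names `accuracy` and `permutation_importance` are assumptions for illustration only.

```python
import random


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled.

    predict: callable mapping one row (list of floats) to a label.
    X: list of rows; y: list of labels.
    """
    rng = random.Random(seed)
    baseline = accuracy(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, breaking its link to the labels.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:]
                      for row, v in zip(X, column)]
            permuted = accuracy(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
X = [[data_rng.uniform(0, 30), data_rng.uniform(0, 30)] for _ in range(500)]
y = [int(row[0] > 15) for row in X]


def toy_model(row):
    """Stand-in for a trained model: thresholds feature 0 only."""
    return int(row[0] > 15)


scores = permutation_importance(toy_model, X, y)
# Rank feature indices from most to least important.
ranking = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
print("importance scores:", scores)
print("feature ranking:", ranking)
```

Comparing the resulting ranks between a regionally weighted model and an unweighted CONUS model is one way to see which variables matter more in a given region. For scikit-learn estimators, `sklearn.inspection.permutation_importance` provides a ready-made version of the same idea.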
