Testing Random Forests for Prediction of Excessive Rainfall Based on the High-Resolution Rapid Refresh (HRRR)

James, Eric

Machine learning based on numerical weather prediction (NWP) shows promise for prediction of severe weather events, although questions remain regarding the approach required when using convection-allowing models (CAMs) versus coarse, convection-parameterizing NWP models for the day-one period (12 UTC to 12 UTC). In this study, we evaluate several questions surrounding the development of a random forest (RF) model for excessive rainfall prediction using daily 00 UTC initializations from a CAM: the High-Resolution Rapid Refresh (HRRR). We demonstrate that extending the training period from two to three years has a small positive impact on forecast skill, and that a HRRR version mismatch (HRRRv3/HRRRv4) between the training data and forecasting data has a small negative impact. In addition, in an effort to filter out noise related to convective-scale predictors, we show results of predictor assembly experiments exploring the use of spatial and temporal search radii in the construction of training matrices. Spatial aggregation of predictors provides forecast benefit in most parts of the CONUS, while use of hourly predictors (versus 3-hourly) leads to improved performance in the southwestern CONUS, reflecting the shorter duration of excessive rainfall events in that region. Using a time-lagged ensemble of three daily HRRR forecasts provides forecast benefit in certain situations. Finally, we explore the forecast benefit associated with using observed flash flood guidance exceedances for training. These results suggest the potential for CAM-based RFs to play an important role in the day-one forecasting process for excessive rainfall.
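
As an illustration of what predictor assembly with spatial and temporal search radii might look like, the following is a minimal sketch only. The abstract does not state how predictors are aggregated, so the use of a simple mean over a square window is an assumption, as are the function names `neighborhood_mean` and `training_row`, the radius parameters, and the toy precipitation values.

```python
from statistics import fmean


def neighborhood_mean(field, t, y, x, space_radius=1, time_radius=0):
    """Mean of a predictor over a square spatial window and a temporal window.

    `field` is a nested list indexed [time][row][col]. Windows are clipped
    at the domain edges. Mean aggregation is an assumption for illustration.
    """
    n_t, n_y, n_x = len(field), len(field[0]), len(field[0][0])
    values = [
        field[tt][yy][xx]
        for tt in range(max(0, t - time_radius), min(n_t, t + time_radius + 1))
        for yy in range(max(0, y - space_radius), min(n_y, y + space_radius + 1))
        for xx in range(max(0, x - space_radius), min(n_x, x + space_radius + 1))
    ]
    return fmean(values)


def training_row(fields, t, y, x, space_radius=1, time_radius=0):
    """One row of a training matrix: one aggregated value per predictor field."""
    return [
        neighborhood_mean(f, t, y, x, space_radius, time_radius) for f in fields
    ]


# Toy predictor with 2 forecast times on a 3x3 grid (made-up values).
precip = [
    [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0], [6.0, 7.0, 8.0]],
    [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
]

# Aggregate around the central grid point at the first forecast time.
row = training_row([precip], t=0, y=1, x=1, space_radius=1, time_radius=0)
```

Setting `space_radius=0` and `time_radius=0` recovers the raw grid-point value, so the same hypothetical helper can represent both the unaggregated and aggregated experiments. Rows built this way for many grid points and days could then be stacked into a matrix for any RF implementation.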

16.3 Testing Random Forests for Prediction of Excessive Rainfall Based on the High-Resolution Rapid Refresh (HRRR)