3A.2 Probabilistic Locally Extreme Precipitation Forecasting with Machine Learning

Friday, 28 July 2017: 1:45 PM
Constellation E (Hyatt Regency Baltimore)
Gregory R. Herman, Colorado State Univ., Fort Collins, CO; and R. S. Schumacher

Machine learning algorithms are implemented to develop skillful, calibrated probabilistic forecasts of extreme precipitation across the contiguous United States (CONUS), framed in the context of average recurrence interval (ARI) exceedances. Specifically, forecasts are produced for 24-hour precipitation accumulations for 1- and 10-year ARI exceedances at two forecast lead times: 36-60 hours and 60-84 hours. CONUS is partitioned into eight regions that exhibit similar hydrometeorological properties. Within each of these regions, forecasts are produced for each forecast point on a coarse (~0.5°) grid, for each day in the 11-year historical record spanning 2003-2013, and for each of the two forecast intervals. Predictor data used to generate exceedance probabilities come from an assortment of simulated atmospheric fields taken from a record of NOAA’s Second-Generation Global Ensemble Forecast System Reforecast (GEFS/R). For each field used, model forecast data are taken relative to each forecast point in space and for each output time step over the given 24-hour forecast interval. This produces a large number of candidate GEFS/R predictors for extreme quantitative precipitation forecasting (QPF). To make the analysis more tractable and to alleviate concerns of overfitting, this rearranged record of historical model data is pre-processed with principal component analysis (PCA), and the primary modes of atmospheric variability within each region of CONUS are identified. The dimension-reduced GEFS/R model data are then supplied to a random forest (RF) algorithm to produce 1-year or 10-year ARI exceedance probabilities within ~40 km of a point. Numerous sensitivity experiments are performed to diagnose the impact of various components of forecast information from the dynamical model ensemble on the forecast skill of the RF model. The results of these experiments, and the final results of applying this methodology, will be presented.
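The PCA-then-RF chain described above can be illustrated with a minimal sketch. It assumes scikit-learn and NumPy and uses purely synthetic data; the array sizes, the number of retained components, the forest settings, and the way the synthetic exceedance label is built are all hypothetical choices for illustration, not the configuration used in this work.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Hypothetical sizes: one row per day, one column per
# (field, relative grid point, output time step) combination.
n_days, n_fields, n_points, n_times = 600, 4, 9, 5
X = rng.normal(size=(n_days, n_fields * n_points * n_times))

# Synthetic rare-event label standing in for an ARI exceedance
# (about 5% of days); real labels would come from observations.
signal = X[:, :5].sum(axis=1)
y = (signal > np.quantile(signal, 0.95)).astype(int)

# Standardize, reduce dimension with PCA, then fit a random forest.
model = make_pipeline(
    StandardScaler(),
    PCA(n_components=20),  # assumed number of retained modes
    RandomForestClassifier(
        n_estimators=200, min_samples_leaf=5, random_state=0
    ),
)

# Chronological split: train on earlier days, predict later days.
train, test = slice(0, 450), slice(450, None)
model.fit(X[train], y[train])

# Column 1 of predict_proba is the probability of the exceedance class.
probs = model.predict_proba(X[test])[:, 1]
```

Fitting the scaler and PCA inside the pipeline keeps the dimension reduction restricted to the training days, so no information from the verification days leaks into the retained modes.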
Overall, the results were extremely positive: the probabilistic forecasts were both much more skillful and more reliable than probabilistic forecasts generated from the raw ensemble QPFs, as discerned respectively by ranked probability skill score calculations and analysis of reliability diagrams from the 11-year period of record. Ultimately, substantially more than one day of forecast skill was added by applying the machine learning methodology.
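As a rough illustration of the two verification ideas mentioned above, the following sketch computes a skill score against a reference forecast and the binned points behind a reliability diagram. It assumes a single binary exceedance event, for which the ranked probability score reduces to the Brier score; the function names and the toy forecast values are hypothetical and are not results from this study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """Skill relative to a reference forecast; positive means better."""
    return 1.0 - brier_score(probs, outcomes) / brier_score(ref_probs, outcomes)


def reliability_table(probs, outcomes, n_bins=5):
    """Per-bin (mean forecast probability, observed frequency, count)."""
    bins = [[0, 0.0, 0] for _ in range(n_bins)]
    for p, o in zip(probs, outcomes):
        i = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into last bin
        bins[i][0] += 1
        bins[i][1] += p
        bins[i][2] += o
    return [(s / c, e / c, c) for c, s, e in bins if c]


# Toy data: 1 marks a day on which the threshold was exceeded.
outcomes = [0, 0, 0, 1, 0, 0, 1, 0]
ml_probs = [0.1, 0.1, 0.2, 0.8, 0.1, 0.2, 0.7, 0.1]   # hypothetical ML output
raw_probs = [0.0, 0.5, 0.0, 0.5, 0.0, 0.5, 0.0, 0.0]  # hypothetical raw ensemble

bss = brier_skill_score(ml_probs, raw_probs, outcomes)
table = reliability_table(ml_probs, outcomes)
```

A perfectly reliable forecast would have the observed frequency equal to the mean forecast probability in every bin, which is the diagonal of a reliability diagram.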