15.6 Making Sense of Random Forest-Based Severe Weather Forecasts Using Tree Interpreter

Thursday, 20 July 2023: 3:15 PM
Madison Ballroom A (Monona Terrace)
Alexandra Mazurek, Colorado State Univ., Fort Collins, CO; and A. J. Hill and R. S. Schumacher

Despite the rapidly increasing use of machine learning (ML) in weather forecasting, the perceived lack of transparency of ML models continues to be a regularly cited impediment to greater use in operations. While quantitative and qualitative verification and testbeds have been useful for building forecaster trust in experimental ML models, these assessments are limited in that they are primarily retrospective. Thus, as ML-based tools are used more frequently in operational meteorology, there is an increased need for explainability methods that can help forecasters make sense of ML predictions in real time.

One tool that has been helpful for dissecting ML model output (specifically output from models that rely on random forest methods) is tree interpreter (TI). TI is a Python package that works in tandem with the scikit-learn library and disaggregates random forest-based probabilities by feature, which offers insight into how individual model inputs influence a final prediction. This approach provides important benefits beyond existing methods that consider feature importances only in an aggregate sense. In this work, TI is used to examine probabilistic forecasts for severe hazards (tornadoes, wind, and hail) from the Colorado State University Machine Learning Probabilities (CSU-MLP) system. Contributions from 15 Global Ensemble Forecast System (GEFS) environmental fields (which serve as model predictors) are analyzed in time and space for approximately two years of daily forecasts. While CSU-MLP produces probabilistic severe forecasts out to Day 8, contributions corresponding to forecasts for Days 2-4 will be emphasized. Results will cover diurnal, seasonal, and spatial trends identified among the contributions. This presentation will conclude with a discussion of the potential utility of viewing TI contributions alongside CSU-MLP severe probabilities in operational forecasting settings, as well as plans for developing forecaster-friendly visualizations of these data.
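
To make the idea of disaggregating a probability by feature concrete, the following is a minimal sketch of a path-based decomposition of the kind TI performs: each prediction is written as a bias term (the root-node value) plus one contribution per feature, accumulated along the decision path. It is a dependency-free toy, not the TI package's API and not the CSU-MLP configuration. The two hand-built trees, the feature names `cape` and `shear`, and all thresholds and node values are hypothetical and chosen only for illustration.

```python
# Toy sketch of a per-feature decomposition of a random forest probability.
# All trees, feature names, thresholds, and node values are hypothetical.

def leaf(value):
    """Terminal node holding the event frequency of its training samples."""
    return {"value": value}

def split(feature, threshold, value, left, right):
    """Internal node; `value` is the event frequency of samples reaching it."""
    return {"feature": feature, "threshold": threshold, "value": value,
            "left": left, "right": right}

def decompose_tree(tree, sample):
    """Return (prediction, bias, contributions) for a single tree."""
    bias = tree["value"]          # root value: prediction before any split
    contributions = {}
    node = tree
    while "feature" in node:
        name = node["feature"]
        child = node["left"] if sample[name] <= node["threshold"] else node["right"]
        # The change in node value is credited to the feature used to split.
        contributions[name] = contributions.get(name, 0.0) + child["value"] - node["value"]
        node = child
    return node["value"], bias, contributions

def decompose_forest(forest, sample):
    """Average the per-tree decompositions over every tree in the forest."""
    n = len(forest)
    prediction, bias, contributions = 0.0, 0.0, {}
    for tree in forest:
        p, b, c = decompose_tree(tree, sample)
        prediction += p / n
        bias += b / n
        for name, delta in c.items():
            contributions[name] = contributions.get(name, 0.0) + delta / n
    return prediction, bias, contributions

# Two hypothetical trees splitting on instability ("cape") and wind shear ("shear").
tree_a = split("cape", 1000.0, 0.10,
               leaf(0.02),
               split("shear", 20.0, 0.30, leaf(0.15), leaf(0.60)))
tree_b = split("shear", 15.0, 0.12,
               leaf(0.04),
               split("cape", 1500.0, 0.25, leaf(0.10), leaf(0.50)))

sample = {"cape": 2000.0, "shear": 25.0}
prediction, bias, contributions = decompose_forest([tree_a, tree_b], sample)

print(f"probability = {prediction:.3f}, bias = {bias:.3f}")
for name, delta in sorted(contributions.items()):
    print(f"  {name}: {delta:+.3f}")
```

By construction, the bias plus the sum of the feature contributions equals the forest probability, which is what allows a single forecast value to be attributed to individual predictors rather than summarized only by aggregate feature importances.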
