Explaining the Sources of Uncertainty in Machine Learning Winter Precipitation-Type Predictions

Becker, Charlie

Correctly forecasting the timing and location of changes in winter precipitation type could help decision makers mitigate the worst impacts of winter storms. Multiple precipitation-type (p-type) algorithms have been developed from both physical and statistical perspectives, but all of them struggle to distinguish sleet and freezing rain from rain and snow. What meteorological factors drive the repeatedly poor performance for sleet and freezing rain? In this project, we developed an evidential neural network that predicts both the probability of each winter precipitation type and the aleatoric and epistemic uncertainties associated with that probability. We trained our model on simulated soundings from NOAA Rapid Refresh model analyses over a multiyear period in the mid-to-late 2010s and paired the soundings with crowd-sourced precipitation-type reports from the NOAA mPING dataset. In our model evaluation, we found that the evidential neural network displayed skill at predicting all p-types but still performed worse for sleet and freezing rain, both of which also generally exhibited higher predicted uncertainty. We investigated the causes of the worse performance and higher uncertainties and found three main sources. First, the mPING data contain a significant number of reports that are not associated with typical sounding profiles for their reported p-type, potentially due to either user error or adversarial reporting. Filtering these reports from the training data greatly improved the reliability and physical realism of the predicted probabilities. Second, composite soundings for each predicted precipitation type, ranked by predicted uncertainty, show that uncertainty increases as the soundings drift closer to the freezing line and that freezing rain and sleet soundings differ in the heights of their warm noses and the depths of their freezing layers.
Third, we employed explainable AI (XAI) and interactive visualization techniques to explore the contributions of different sounding features to the p-type predictions. While surface temperature and dewpoint tended to be the most important features on average, we found that the height of the most important meteorological features varied in the vicinity of frontal passages and the progression of the freezing line, illustrating the importance of case studies and meteorological analysis when attributing the sources of machine learning (ML) model predictions. Finally, we will show how the p-type algorithm performs when applied to multiple numerical weather prediction (NWP) models for cases in the Northeast US and analyze meteorological contributors to forecast successes and busts.
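As a rough illustration of how an evidential classifier can report a probability together with aleatoric and epistemic uncertainty, the sketch below assumes a common Dirichlet-based formulation: the network outputs non-negative evidence per class, the Dirichlet parameters are evidence plus one, and the law of total variance splits the predictive variance into the two components. The abstract does not state which formulation was used, so the function name `dirichlet_uncertainty`, the class ordering in `P_TYPES`, and the evidence values are all hypothetical.

```python
import numpy as np

# Hypothetical class ordering, chosen only for this illustration.
P_TYPES = ("rain", "snow", "sleet", "freezing rain")


def dirichlet_uncertainty(evidence):
    """Per-class probability and uncertainties from non-negative evidence.

    Assumes a Dirichlet-based evidential formulation (alpha = evidence + 1);
    this is an assumption, not necessarily the formulation used in the study.
    """
    evidence = np.asarray(evidence, dtype=float)
    alpha = evidence + 1.0                          # Dirichlet parameters
    strength = alpha.sum(axis=-1, keepdims=True)    # total evidence S
    prob = alpha / strength                         # expected class probability
    total = prob * (1.0 - prob)                     # variance of a one-hot label
    aleatoric = total * strength / (strength + 1.0)  # E[p(1 - p)]
    epistemic = total / (strength + 1.0)             # Var[p]
    return prob, aleatoric, epistemic


# Made-up evidence vectors: one with strong evidence for snow, one with
# little evidence for any class.
confident = dirichlet_uncertainty([0.5, 40.0, 0.5, 0.5])
ambiguous = dirichlet_uncertainty([0.2, 0.3, 0.6, 0.5])

for name, (prob, alea, epi) in (("confident", confident), ("ambiguous", ambiguous)):
    top = P_TYPES[int(np.argmax(prob))]
    print(f"{name}: top class = {top}, summed epistemic = {epi.sum():.4f}")
```

Under these assumptions, low total evidence inflates the epistemic term, while the aleatoric term stays large whenever the expected probabilities are spread across several classes, which is the kind of behavior one would expect for soundings near the freezing line.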

2B.3 Explaining the Sources of Uncertainty in Machine Learning Winter Precipitation-Type Predictions