13.6 My favorite model has improved but has it improved for me? A perspective on forecast evaluation and verification

Thursday, 1 February 2024: 9:45 AM
302/303 (The Baltimore Convention Center)
James Correia Jr., CIRES, College Park, MD; NWS/Weather Prediction Center, College Park, MD; and S. M. Trojniak

The goal of NWP forecast evaluation and verification is to assess how well forecast characteristics match observations. The hope is that metrics, combined with user experiences, can help guide forecast model improvement. Measuring that improvement across different time periods, say different years, is tricky because the events that occur may differ in character, frequency, or type. This is exacerbated when the events of interest are rare, as with extreme precipitation forecasts. For extreme precipitation, defined loosely here as events exceeding 1", events may be rare both in days per year and in areal coverage and location. Many events at small scales present a smaller target than a few large events, yet it is these larger events that may contribute more to forecast skill than the everyday, smaller events forecasters must contend with.

Here we perform an "idealized" experiment using observed MRMS 6-hour precipitation objects, assessing object sizes at different thresholds and in different years and examining how object size contributes to standard contingency table metrics such as the Critical Success Index (i.e., given a perfect forecast, how do individual objects contribute to CSI?). We then apply the same analysis to model forecast object distributions for comparison. From there we examine how these events are distributed in time to understand how a score such as CSI is achieved relative to the sample size of days. In this way, forecasters can see just how important certain events or event days are and make a connection between the scores and their perception of model quality. Ultimately, we hope to provide perspective on notions of model quality derived from verification and on their everyday use by forecasters.
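The perfect-forecast thought experiment above can be sketched in a few lines. This is an illustrative sketch only, not the authors' actual method: the object sizes are invented, and the "drop one object" framing (treating a single missed object's cells as misses while everything else remains a hit) is an assumption about how per-object contributions to CSI might be measured.

```python
# Illustrative sketch: per-object contribution to CSI under a perfect forecast.
# Object sizes and the drop-one-object framing are assumptions for illustration.

def csi(hits, misses, false_alarms):
    """Critical Success Index: hits / (hits + misses + false_alarms)."""
    denom = hits + misses + false_alarms
    return hits / denom if denom else 0.0

# Hypothetical precipitation objects, sized in grid cells exceeding a
# threshold (e.g., 1 inch in 6 hours): several small objects, one large event.
object_sizes = [5, 8, 12, 40, 250]
total = sum(object_sizes)

# A perfect forecast turns every object cell into a hit, so CSI = 1.
assert csi(total, 0, 0) == 1.0

# Sensitivity of the pooled score to each object: the CSI that remains if
# that one object were missed entirely (its cells become misses).
for size in object_sizes:
    remaining = csi(total - size, size, 0)
    print(f"missing the {size:3d}-cell object leaves CSI = {remaining:.3f}")
```

The loop makes the size asymmetry concrete: missing one small object barely dents the pooled CSI, while missing the single large event collapses it, which is the sense in which a few large events can dominate the score.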
