16th Conference on Probability and Statistics in the Atmospheric Sciences

1.2

Transformed skill scores

Tressa L. Fowler, NCAR, Boulder, CO; and R. G. Bullock and B. G. Brown

Skill scores are commonly used in forecast verification to compare forecasts. They also resemble goodness-of-fit (GOF) tests: a GOF test compares observed multinomial data to a model, while a skill score compares categorical forecasts to a baseline forecast such as chance, persistence, climatology, or the current operational standard.

A generalized form for GOF tests has been derived. This generalized form, called the power divergence family of statistics, has an adjustable power parameter. The parameter appears in the constant term but, most importantly, it is the exponent applied to the observed counts. Research suggests that the generalized test works well in a variety of situations when the parameter (i.e., the exponent) is set to 2/3. For instance, the test balances sensitivity to lack of fit against robustness to a departure in a single cell. With this parameter value the test also works with sparse cell counts or finite populations. Additionally, the moments of the statistic closely match the asymptotically derived chi-square moments.
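As a rough illustration of the family described above, the following sketch computes a power divergence statistic in its commonly cited Cressie-Read form, 2 / (lam * (lam + 1)) * sum(O * ((O / E)**lam - 1)). The function name `power_divergence`, the argument names, and the sample counts are illustrative assumptions and do not come from the paper.

```python
import math

def power_divergence(observed, expected, lam=2 / 3):
    """Power divergence goodness-of-fit statistic (illustrative sketch).

    observed, expected: sequences of cell counts with equal totals.
    lam: the power parameter; 2/3 is the value recommended in the text,
         lam = 1 gives Pearson's chi-square, lam = 0 the likelihood ratio.
    """
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if lam == -1:
        raise ValueError("lam = -1 is a limiting case not handled in this sketch")
    pairs = list(zip(observed, expected))
    if lam == 0:
        # Limit as lam -> 0: the log-likelihood ratio statistic G^2.
        return 2 * sum(o * math.log(o / e) for o, e in pairs if o > 0)
    # The parameter sets the constant term and the exponent on O / E.
    total = sum(o * ((o / e) ** lam - 1) for o, e in pairs)
    return 2 * total / (lam * (lam + 1))

# Hypothetical counts: four cells, uniform model.
obs = [18, 22, 30, 30]
exp = [25, 25, 25, 25]
pearson = power_divergence(obs, exp, lam=1.0)   # matches sum((O - E)^2 / E)
cressie_read = power_divergence(obs, exp)       # lam = 2/3
```

With `lam=1.0` the result agrees with Pearson's chi-square for these counts, which is one way to check the formula before trying other exponents.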

Since skill scores and GOF tests have much in common, skill scores may benefit from a similar treatment. As illustrated by metrics such as L2 and L1, the exponent can have quite a large effect on a measure. In this paper, the effect of exponentiation on skill scores derived from the 2x2 contingency table is investigated for several sets of forecasts. The effects of overforecasting, underforecasting, and rare events on the transformed skill scores are examined and compared to their effects on the original skill scores. In particular, the equitability of the transformed scores is assessed. Exponentiation may change some of the characteristics of skill scores. However, because the transformed scores are still based on the same counts from the contingency table, they may not differ fundamentally from their original form.
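For readers unfamiliar with 2x2 skill scores, the sketch below computes two standard untransformed scores, the Heidke skill score and the Gilbert skill score (equitable threat score), along with the frequency bias that indicates over- or underforecasting. The abstract does not state the transformation the authors apply, so none is attempted here. The function names and the sample counts are illustrative assumptions.

```python
def frequency_bias(hits, false_alarms, misses, correct_negatives):
    """Ratio of forecast 'yes' to observed 'yes'; > 1 means overforecasting."""
    return (hits + false_alarms) / (hits + misses)

def heidke_skill_score(hits, false_alarms, misses, correct_negatives):
    """Fraction correct relative to the number correct expected by chance."""
    n = hits + false_alarms + misses + correct_negatives
    expected_correct = (
        (hits + false_alarms) * (hits + misses)
        + (misses + correct_negatives) * (false_alarms + correct_negatives)
    ) / n
    return (hits + correct_negatives - expected_correct) / (n - expected_correct)

def gilbert_skill_score(hits, false_alarms, misses, correct_negatives):
    """Threat score adjusted for the hits expected by chance."""
    n = hits + false_alarms + misses + correct_negatives
    hits_random = (hits + false_alarms) * (hits + misses) / n
    return (hits - hits_random) / (hits + false_alarms + misses - hits_random)

# Hypothetical table: a modestly overforecast event.
table = (20, 10, 5, 65)       # hits, false alarms, misses, correct negatives
# Hypothetical table in which forecasts are independent of observations.
random_table = (10, 30, 15, 45)

bias = frequency_bias(*table)                 # 30 forecast yes / 25 observed yes
hss = heidke_skill_score(*table)
gss = gilbert_skill_score(*table)
hss_random = heidke_skill_score(*random_table)
gss_random = gilbert_skill_score(*random_table)
```

Both scores are zero for `random_table`, where the forecasts carry no information about the observations. That is the behavior an equitability check looks for in a random forecast.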

Session 1, forecast evaluation
Monday, 14 January 2002, 9:30 AM-3:30 PM
