Friday, 16 August 2002: 11:30 AM
How Traditional NWS Verification Encourages "Hedging", and a Possible Remedy
A verification "score" measures the accuracy of a given forecast method. Measuring the "value" of the method, though, requires comparing its score with those of other methods, and the comparison must be constructed as carefully as the verification score itself. With temperature, for example, the mean absolute error (MAE) of human forecasts can be compared with the MAE of Model Output Statistics (MOS) guidance to make inferences about the value of the human forecasts. In the traditional method used by the National Weather Service (NWS), the difference between the respective MAEs measures the value. Examples will show, however, how under this method a forecast may be "hedged" to improve its perceived "value" versus MOS while at the same time worsening its own verification score. This article discusses a comparison method that does not suffer from this problem and contrasts it with the traditional method.
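As a rough illustration of the traditional comparison described above, the following minimal Python sketch computes an MAE for each forecast source and takes the difference. The temperatures are made up for illustration, the function names are hypothetical, and the sign convention (MOS MAE minus human MAE, so that a positive value favors the human forecast) is an assumption, not something stated in the abstract. The sketch does not attempt to reproduce the hedging examples or the proposed alternative method.

```python
def mean_absolute_error(forecasts, observations):
    """Return the mean absolute error of forecasts against observations."""
    if not forecasts or len(forecasts) != len(observations):
        raise ValueError("forecasts and observations must be non-empty and equal in length")
    total = sum(abs(f - o) for f, o in zip(forecasts, observations))
    return total / len(forecasts)


def mae_difference(human, mos, observations):
    """Traditional-style comparison: difference between the two MAEs.

    Assumed sign convention: MOS MAE minus human MAE, so a positive
    result means the human forecasts had the lower (better) MAE.
    """
    return mean_absolute_error(mos, observations) - mean_absolute_error(human, observations)


# Made-up temperatures (degrees F), purely for illustration.
observed = [70.0, 65.0, 80.0, 75.0]
mos = [72.0, 64.0, 77.0, 75.0]
human = [71.0, 64.0, 78.0, 76.0]

print("MOS MAE:", mean_absolute_error(mos, observed))
print("Human MAE:", mean_absolute_error(human, observed))
print("Difference (MOS - human):", mae_difference(human, mos, observed))
```

With these invented numbers the human forecasts have the lower MAE, so the difference is positive under the assumed sign convention.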