Bayesian Verification of Warnings
Consider a system that issues warnings of intermittent hazards (e.g., severe thunderstorm warnings, tornado warnings), or deterministic forecasts of intermittent events (e.g., heavy precipitation, strong wind, blizzard). The performance of such a warning system is usually evaluated in terms of a probability of detection (POD) and a false alarm rate (FAR). Often, the two measures are aggregated into a critical success index (CSI). These verification measures are examined from the viewpoint of a rational decider who, following the precepts of Bayesian theory, uses the warning to choose the optimal action by minimizing the expected disutility of the outcome. In the long run, the performance of this warning-decision system is evaluated in terms of the integrated minimum expected disutility (D). Then, given two warning systems, A and B, system A is said to be more informative than system B if D_A ≤ D_B for every disutility function of outcome (equivalently, for every rational decider).
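As a minimal sketch of how the three conventional measures are typically computed, the following assumes the definitions common in warning verification: POD as the fraction of events that were warned, FAR as the fraction of warnings not followed by an event, and CSI as hits divided by the sum of hits, misses, and false alarms. The text above does not state these formulas, so they are assumptions, and the function name and counts are hypothetical.

```python
def verification_scores(hits, misses, false_alarms):
    """Return (POD, FAR, CSI) from warning-verification counts.

    Assumed definitions (not stated in the text above):
      POD = hits / (hits + misses)
      FAR = false_alarms / (hits + false_alarms)
      CSI = hits / (hits + misses + false_alarms)
    """
    pod = hits / (hits + misses)
    far = false_alarms / (hits + false_alarms)
    csi = hits / (hits + misses + false_alarms)
    return pod, far, csi


# Hypothetical counts: 45 warned events, 5 missed events, 30 false alarms.
pod, far, csi = verification_scores(hits=45, misses=5, false_alarms=30)

# Under these definitions, CSI is a function of POD and FAR alone:
# 1/CSI = 1/POD + 1/(1 - FAR) - 1.
csi_from_pod_far = 1.0 / (1.0 / pod + 1.0 / (1.0 - far) - 1.0)
```

Correct rejections (no warning, no event) do not enter any of the three scores under these definitions, which is one reason they are popular for rare events.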
A proof is presented that, while POD and FAR are sufficient verification measures (in the sense that POD_A ≥ POD_B and FAR_A ≤ FAR_B imply that A is more informative than B), the CSI is not. In fact, the CSI is a misleading measure because there exist cases (theoretically, infinitely many) with CSI_A > CSI_B (suggesting that A is better than B) but D_A > D_B (meaning that A results in a greater integrated disutility than B does and, therefore, that A is less preferred than B from the viewpoint of a rational decider).
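The kind of reversal described above can be illustrated numerically. The sketch below is not the proof referred to in the text; it assumes a simple cost-loss decider (protective action costs `cost`; an unprotected event costs `loss`), a hypothetical base rate of events, and FAR defined as the fraction of warnings not followed by an event. All names and numbers are illustrative assumptions.

```python
def csi(pod, far):
    """CSI from POD and FAR, assuming FAR = P(no event | warning)."""
    return 1.0 / (1.0 / pod + 1.0 / (1.0 - far) - 1.0)


def integrated_min_disutility(pod, far, base_rate, cost, loss):
    """Integrated minimum expected disutility for a cost-loss decider.

    Assumed disutilities: protecting costs `cost` regardless of outcome;
    not protecting costs `loss` if the event occurs and 0 otherwise.
    """
    # Joint probabilities of (warning status, event status).
    hit = base_rate * pod
    false_alarm = hit * far / (1.0 - far)
    miss = base_rate * (1.0 - pod)
    correct_null = 1.0 - hit - false_alarm - miss

    total = 0.0
    # For each warning status, the decider picks the cheaper action.
    for p_event, p_no_event in ((hit, false_alarm), (miss, correct_null)):
        protect = cost * (p_event + p_no_event)
        do_nothing = loss * p_event
        total += min(protect, do_nothing)
    return total


# Hypothetical systems and decider.
base_rate, cost, loss = 0.1, 1.0, 20.0
csi_a, csi_b = csi(0.5, 0.3), csi(0.9, 0.6)
d_a = integrated_min_disutility(0.5, 0.3, base_rate, cost, loss)
d_b = integrated_min_disutility(0.9, 0.6, base_rate, cost, loss)
```

With these assumed numbers, CSI favors A (about 0.41 versus 0.38), yet this decider incurs D_A = 1.0 with A against D_B = 0.425 with B. The two systems are not ordered by POD and FAR jointly (A has the lower FAR, B the higher POD), so the sufficiency result for POD and FAR does not apply to this pair.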
A new aggregate measure, called the characteristic utility score (CUS), is derived. With respect to the informativeness relation, CUS is a necessary measure (A being more informative than B implies CUS_A ≥ CUS_B) and a conditionally sufficient measure (CUS_A ≥ CUS_B implies that A is more informative than B for a class of deciders with a particular ratio of disutility differences). In conclusion, if management wants to provide an incentive for improving the warning system in a way that increases its socio-economic benefits to users, then the correct verification measures are POD, FAR, and CUS.