The comparative verification was based on two cool-season samples of QPF and validation data for four US areas with diverse precipitation climatologies: the service areas of the Ohio Valley RFC (OHRFC), Arkansas Basin RFC (ABRFC), California-Nevada RFC (CNRFC), and Northwest RFC (NWRFC). The first sample covered October 1998 to March 1999 and involved OHRFC and ABRFC, both located in the eastern US. The QPFs for this sample were valid for 6-h periods within the 1200 UTC to 1200 UTC time frame and were based on the 0000 UTC cycle (Day 1 QPF). The second sample covered November 1999 to March 2000 and involved CNRFC and NWRFC, both located in the western US; its 6-h forecasts, also based on the 0000 UTC cycle, covered Day 1, Day 2, and Day 3.
A comprehensive suite of verification scores was computed from the QPF and observed precipitation data, which were partitioned into discrete and cumulative precipitation intervals and expressed in both continuous and categorical forms. The QPF performance rankings based on the various score types were found to be roughly consistent, and the mean absolute error (a simple error score) with the precipitation in continuous form provided a representative performance measure. In computing this and other scores for the continuous precipitation case, the data were partitioned into discrete intervals according to both the observed precipitation and the QPF, and the two partitions were combined to yield a single score. This complete, concise approach to the scoring has not been previously documented.
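The combined-partition scoring described above can be illustrated with a minimal sketch. The text does not specify how the two partitions are combined, so the sketch assumes one plausible reading: for a given precipitation interval, the absolute errors of cases whose observed amount falls in the interval are pooled with those of cases whose forecast amount falls in the interval, and the pooled errors are averaged. The function name `interval_mae`, the half-open interval convention, and the sample amounts are all hypothetical.

```python
def interval_mae(forecasts, observations, lower, upper):
    """Mean absolute error for the interval [lower, upper).

    Assumption (not taken from the source): the observation-based and
    forecast-based partitions are combined by simple pooling, so a case
    whose forecast AND observation both fall in the interval is counted
    twice.
    """
    pooled_errors = []
    for fcst, obs in zip(forecasts, observations):
        abs_err = abs(fcst - obs)
        if lower <= obs < upper:    # partition by observed amount
            pooled_errors.append(abs_err)
        if lower <= fcst < upper:   # partition by forecast amount
            pooled_errors.append(abs_err)
    if not pooled_errors:
        return None                 # no cases fall in this interval
    return sum(pooled_errors) / len(pooled_errors)


# Hypothetical 6-h amounts (arbitrary units), for illustration only.
qpf = [0.0, 0.2, 0.6, 1.2]
observed = [0.1, 0.5, 0.3, 1.0]

for lo, hi in [(0.0, 0.25), (0.25, 0.75), (0.75, 2.0)]:
    print((lo, hi), interval_mae(qpf, observed, lo, hi))
```

Pooling yields one score per interval that penalizes both missed events (observed in the interval, forecast outside it) and false alarms (forecast in the interval, observed outside it). Other ways of combining the partitions, such as averaging the two conditional scores, would be equally consistent with the description.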
The Day 1 verification results for the two eastern RFCs (first verification sample) showed that the QPFs from RFC and WFO forecasters were slightly worse than the QPF guidance from the HPC and from NCEP's AVN model. For the two western RFCs (second sample), the results over all three forecast days showed that WFO and HPC forecasters achieved roughly equal QPF skill. (RFC-issued QPFs were not scored for the western areas because RFCs in this region have little involvement in QPF production.) Both samples also showed that the accuracy of all QPF products over the nation was low, and that the scores for the best human-generated products were only slightly better than those for the AVN model.