Comparing NWS POP Forecasts to Third-Party Providers
The TWC results are quite interesting and confirm the previous findings of Bickel and Kim (2008), which were based on a much smaller dataset. For example, the same-day forecast is relatively well calibrated for POPs of 0.3 and above but tends to overforecast precipitation for POPs below 0.3. Calibration results for lead times of 1 to 4 days are similar to the same-day results. Calibration performance worsens significantly beyond 6 days. This poor performance is driven by TWC's desire to artificially avoid forecasts of 0.5.
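To make the notion of calibration concrete, the following is a minimal Python sketch of how a calibration table might be computed: forecasts are grouped by the issued POP, and the observed precipitation frequency is reported for each group. The function name `calibration_table`, its parameters, and the toy data are illustrative assumptions and are not taken from the study.

```python
from collections import defaultdict


def calibration_table(forecasts, outcomes, resolution=0.1):
    """Group forecasts by issued POP and report observed frequency per group.

    forecasts  -- POPs in [0, 1]
    outcomes   -- 0/1 flags (1 = precipitation observed)
    resolution -- forecast step size (assumed here: 0.1)

    Returns {pop: (number_of_forecasts, observed_frequency)}.
    """
    counts = defaultdict(lambda: [0, 0])  # bin index -> [n_forecasts, n_wet]
    for f, o in zip(forecasts, outcomes):
        k = round(f / resolution)  # integer bin index avoids float-key issues
        counts[k][0] += 1
        counts[k][1] += o
    return {
        round(k * resolution, 10): (n, wet / n)
        for k, (n, wet) in sorted(counts.items())
    }


# Toy data (hypothetical): ten forecasts of 0.2 with one wet day,
# and five forecasts of 0.6 with three wet days.
forecasts = [0.2] * 10 + [0.6] * 5
outcomes = [1] + [0] * 9 + [1, 1, 1, 0, 0]
table = calibration_table(forecasts, outcomes)
for pop, (n, freq) in table.items():
    print(f"POP {pop:.1f}: n={n}, observed frequency={freq:.2f}")
```

A forecast is well calibrated at a given POP when the observed frequency is close to that POP. In the toy data, the 0.2 group verifies only 10% of the time, which is the overforecasting pattern described above for low POPs.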
CW forecasts POPs at a resolution of 0.01 instead of 0.1, and these forecasts do provide additional information. For example, precipitation is more likely when CW forecasts a POP of p + 0.01 than when it forecasts p (p < 1). In addition, CW's forecasts tend to be well calibrated but exhibit bias, especially the day-ahead forecast. CW's skill score is higher than TWC's for lead times of 1 to 6 days.
The NWS's 1-day to 3-day forecasts are well calibrated. However, they also exhibit bias. For example, the 3-day forecast has a bias of -0.06, meaning that, on average, the NWS underestimates the POP by 0.06. The NWS day-ahead forecast outperforms those of CW and TWC. However, CW's 2-day and 3-day forecasts have higher skill scores than the corresponding NWS forecasts.
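The bias and skill-score comparisons can be illustrated with a short Python sketch. It takes bias to be the mean forecast minus the observed frequency, which matches the sign convention above (a negative bias means underestimation). The skill score shown is a Brier skill score measured against the sample base rate; that choice of reference is an assumption, since this summary does not define the score. All function names and data are hypothetical.

```python
def bias(forecasts, outcomes):
    """Mean forecast minus observed frequency (negative = underforecasting)."""
    n = len(forecasts)
    return sum(forecasts) / n - sum(outcomes) / n


def brier_score(forecasts, outcomes):
    """Mean squared difference between forecast POP and the 0/1 outcome."""
    return sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / len(forecasts)


def brier_skill_score(forecasts, outcomes):
    """Skill relative to always forecasting the sample base rate.

    Assumption: the reference forecast is the observed frequency in the
    sample. Undefined (division by zero) if every outcome is identical.
    """
    base_rate = sum(outcomes) / len(outcomes)
    reference = brier_score([base_rate] * len(outcomes), outcomes)
    return 1.0 - brier_score(forecasts, outcomes) / reference


# Toy data (hypothetical): the forecaster is too low on average.
forecasts = [0.1, 0.1, 0.5, 0.5]
outcomes = [0, 0, 1, 1]
print(f"bias        = {bias(forecasts, outcomes):+.2f}")
print(f"skill score = {brier_skill_score(forecasts, outcomes):.2f}")
```

Under this definition, a skill score of 0 means the forecast is no better than the reference, and a negative score means it is worse.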
In sum, CW does appear to add value to the NWS POP forecasts, including an ability to forecast at the 0.01 level. TWC's near-term forecasts (less than seven days) exhibit less skill than either the NWS or CW. TWC's long-term forecasts (seven to nine days) exhibit either no skill or even negative skill.
Bickel, J. Eric and Seong-Dae Kim. 2008. Verification of The Weather Channel Probability of Precipitation Forecasts. Monthly Weather Review 136(12) 4867-4881.