In this study, we quantitatively document the scale sensitivity of precipitation skill scores for four numerical model formulations run during IHOP. The model comparison includes the operational 12-km Eta, the operational 20-km RUC, an experimental 10-km RUC, and an experimental 12-km LAPS/MM5. Comparisons of the equitable threat score (ETS) and bias are made for each of the models (verified against stage IV precipitation data) on their native grid and on systematically coarsened grids. By systematically upscaling higher-resolution forecasts to coarser grids, we isolate the impact on the skill scores due solely to smoothing the forecast and verification fields. This comparison of traditional skill scores is complemented by spectral analyses of the various forecast and verification fields. In the first set of experiments, both the forecast and verification fields are upscaled, allowing us to assess the scale impacts on comparisons of forecasts with significantly different spectra. In the second set of experiments, only the forecast fields are smoothed, allowing us to evaluate the usefulness of enhanced precipitation detail, as reflected in the traditional skill scores.
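The ETS and bias referenced above follow from a standard 2x2 contingency table of threshold exceedances. The sketch below is a minimal illustration, not the study's verification code; the function name and the assumption of 2-D gridded arrays are ours.

```python
import numpy as np

def ets_and_bias(fcst, obs, thresh):
    """Equitable threat score and frequency bias for one threshold.

    fcst, obs : 2-D arrays of precipitation amounts on a common grid
    thresh    : exceedance threshold (same units as the fields)
    """
    f = fcst >= thresh
    o = obs >= thresh
    hits = np.sum(f & o)            # forecast yes, observed yes
    false_alarms = np.sum(f & ~o)   # forecast yes, observed no
    misses = np.sum(~f & o)         # forecast no, observed yes
    n = f.size
    # Hits expected by chance, given the marginal frequencies
    hits_random = (hits + false_alarms) * (hits + misses) / n
    ets = (hits - hits_random) / (hits + false_alarms + misses - hits_random)
    bias = (hits + false_alarms) / (hits + misses)
    return ets, bias
```

A perfect forecast yields ETS = 1 and bias = 1; overforecasting the areal coverage of an exceedance inflates the bias above 1 and, through the false-alarm term, depresses the ETS.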
The focus of work so far is on the first set of experiments, in which both the forecast and verification precipitation fields are systematically coarsened. For these experiments, we document the skill-score dependence on the spectral characteristics and bias of the precipitation field, thereby confirming the significant scale impacts on precipitation skill scores. Work continues on the second set of experiments, in which only the forecast fields are smoothed. Overall, our results support earlier research suggesting that it may be difficult to show improvement in ETSs for models with increasingly fine resolution.
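The systematic coarsening described above can be illustrated by block-averaging a field onto a grid whose spacing is an integer multiple of the native one. This is a minimal sketch under our own assumptions (2-D arrays, dimensions divisible by the coarsening factor); the actual upscaling procedure used in the study may differ in its handling of grid edges and map projections.

```python
import numpy as np

def upscale(field, factor):
    """Block-average a 2-D field onto a grid 'factor' times coarser.

    Assumes both dimensions are divisible by 'factor'; each coarse
    cell is the mean of a factor x factor block of native cells,
    which conserves the domain-mean precipitation.
    """
    ny, nx = field.shape
    blocks = field.reshape(ny // factor, factor, nx // factor, factor)
    return blocks.mean(axis=(1, 3))
```

Because block-averaging conserves the mean while removing small-scale variance, applying the same operator to both forecast and verification fields changes the skill scores only through the loss of fine-scale detail, which is the effect the first set of experiments isolates.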
Supplementary URL: http://ruc.fsl.noaa.gov