16.3 Sensitivity of Object-Based Verification Results to Configuration Options Using MODE

Thursday, 20 July 2023: 4:45 PM
Madison Ballroom B (Monona Terrace)
Jeffrey D. Duda, GSL, Boulder, CO; CIRES, Boulder, CO; and D. D. Turner


Object-based verification is seeing increasingly widespread use, especially as applied to convection-allowing model (CAM) forecasts. It is well suited to identifying features of interest in CAM forecast grids, such as individual thunderstorms (or storm complexes), updraft helicity tracks, or bands of precipitation, and to quantifying how well those features are represented relative to available observations. However, object-based verification compresses information on the model grid, so some information is lost in the comparison, unlike traditional grid-to-grid verification, in which information at every grid point is accounted for. The two methods are therefore complementary, not competitive. Recent work by Duda and Turner (2021) applied object-based verification to operational High-Resolution Rapid Refresh, version 3 (HRRRv3) forecasts; a follow-up study is in preparation that uses object-based techniques to objectively compare HRRRv3 and HRRRv4 forecasts.

These verification studies use the Method for Object-Based Diagnostic Evaluation (MODE), a tool in the Developmental Testbed Center’s publicly available Model Evaluation Tools (MET) software suite, and thus one that can attain widespread use. A complication arises when using MODE, or any object-based verification method, however: tuning the configuration options. MODE in particular includes a few dozen user-configurable numerical settings that control how objects are identified and compared, and many of these settings lack a physical basis. They affect the interest value computation, which provides a quantitative measure of the similarity between the two objects in a pair (typically one from the forecast and one from the observations). The interest computation includes weights for object attribute comparisons such as area, intensity, and location. These weights are restricted to non-negative values; their optimal values are unknown and likely depend on the details of the field and on the attributes of interest to the user. It is therefore worth investigating the sensitivity of output metrics to the choice of weights.

In this presentation we show results from testing the sensitivity of the metrics used by Duda and Turner to the attribute weight values. The test data comprise composite reflectivity objects from summer 2020 HRRRv4 cases. The methodology is summarized as follows. Vectors of size 10 (each element representing a different attribute comparison weight) were generated randomly according to pre-defined experimental patterns. Initial work suggested that the greatest sensitivity occurs when some of the weights are set to 0, which in effect removes the corresponding object attribute comparison from the object pair interest computation. This detail can be very important to the verification metrics for targeted forecast features, e.g., if object shape and size are the only attributes considered important. Therefore, in many of the tests some weights were fixed at 0, although which attribute's weight was fixed was allowed to vary. Several hundred realizations were used for each pattern to quantify uncertainty in the sensitivity. Sensitivity among the metrics is largest when most of the weights are fixed at 0, but substantial sensitivity remains even when all 10 weights are non-zero.
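For context, MODE combines the individual attribute comparisons for an object pair into a single total interest value. A sketch of the general form (the notation follows the form described in the MODE documentation; the attribute examples are illustrative) is

\[
T(\alpha) \;=\; \frac{\sum_i w_i\, C_i(\alpha)\, I_i(\alpha_i)}{\sum_i w_i\, C_i(\alpha)},
\]

where \(\alpha_i\) is the value of attribute comparison \(i\) for the pair (e.g., centroid distance, area ratio, or intensity difference), \(I_i\) maps that value to an interest between 0 and 1, \(C_i\) is a confidence factor, and \(w_i\) is the corresponding user-specified, non-negative weight. Setting \(w_i = 0\) removes attribute \(i\) from the computation entirely, which is the mechanism used in the experimental patterns described above.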
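A minimal sketch of the kind of weight-vector sampling described above, assuming uniform random draws for the non-zero weights (the function name and the specific sampling choices are hypothetical, not taken from the study):

```python
import numpy as np

def sample_weight_vectors(n_realizations, n_weights=10, n_zero=0, seed=None):
    """Draw random, non-negative MODE attribute-weight vectors.

    Hypothetical sketch of the sampling described in the abstract (not the
    authors' code): in each realization, n_zero randomly chosen weights are
    fixed at 0, removing those attribute comparisons from the interest
    computation; the remaining weights are drawn uniformly from [0, 1).
    """
    rng = np.random.default_rng(seed)
    w = rng.uniform(0.0, 1.0, size=(n_realizations, n_weights))
    for row in w:
        if n_zero > 0:
            row[rng.choice(n_weights, size=n_zero, replace=False)] = 0.0
    return w

# Example pattern: 500 realizations with 4 of the 10 weights fixed at 0
# (which attributes are zeroed varies from realization to realization).
weights = sample_weight_vectors(500, n_zero=4, seed=42)
```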
Additionally, a principal component analysis was performed on these tests to isolate the magnitude of impact of each attribute weight, which sheds additional light on the sensitivity. In general, the results cast doubt on the robustness of results obtained from object-based verification using MODE. Measures of uncertainty should be included in any comparison between modeling systems that uses MODE, especially if the comparison involves tuning parameters that are not physically based.
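A hedged sketch of how such a PCA-based sensitivity analysis can be structured, using synthetic stand-in data in place of actual MODE output (a generic illustration, not the authors' analysis):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (not study data): 500 weight vectors and, for each,
# a handful of verification metrics that would in practice come from
# running MODE with that weight configuration.
weights = rng.uniform(0.0, 1.0, size=(500, 10))
metrics = weights @ rng.normal(size=(10, 4)) + 0.1 * rng.normal(size=(500, 4))

# Standardize the metric matrix and compute its principal components.
X = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)
eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
order = np.argsort(eigvals)[::-1]
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

pc_scores = X @ eigvecs               # realization-by-PC score matrix
explained = eigvals / eigvals.sum()   # fraction of variance per PC

# Correlating each attribute weight with the leading PC suggests which
# weights most strongly drive the spread among the verification metrics.
lead_corr = [np.corrcoef(weights[:, j], pc_scores[:, 0])[0, 1]
             for j in range(weights.shape[1])]
```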
