334 Some new insights on interpreting TAF verification statistics

Tuesday, 25 January 2011
Charles K. Kluepfel, NOAA/NWS/OM, Silver Spring, MD

This study examines how forecasters compare with statistical guidance products in forecasting instrument flight rule (IFR) conditions (ceilings below 300 meters or visibilities below 4800 meters). All scheduled and amended terminal aerodrome forecasts (TAFs) from NOAA's National Weather Service (NWS) for the period July 2009 to June 2010 were used, and the guidance products were the GFS LAMP, the GFS MOS, and the NAM MOS. Forecasters have access to all of these products before issuing their TAFs. Forecasts of temporary (TEMPO) conditions were ignored in this study.
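As a minimal sketch of the event definition above, the two-category (IFR or not IFR) classification could be written as follows. The function name `is_ifr`, its parameters, and the use of `None` for an unlimited ceiling are illustrative assumptions; only the 300 meter and 4800 meter thresholds come from the abstract.

```python
from typing import Optional

# Thresholds as stated in the abstract.
IFR_CEILING_M = 300.0
IFR_VISIBILITY_M = 4800.0


def is_ifr(ceiling_m: Optional[float], visibility_m: float) -> bool:
    """Return True when the ceiling or the visibility is below its IFR threshold.

    ceiling_m is None when there is no ceiling (hypothetical convention).
    """
    low_ceiling = ceiling_m is not None and ceiling_m < IFR_CEILING_M
    low_visibility = visibility_m < IFR_VISIBILITY_M
    return low_ceiling or low_visibility
```

Applying the same function to both the forecast and the verifying observation yields the yes/no pairs from which a two-category contingency table is built.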

At first glance, the respective threat scores and two-category Heidke skill scores for TAFs and GFS LAMP were very similar for 3-6 hour forecasts (no improvement over guidance); however, a different picture appears when the false alarm ratios and hit rates are examined. Forecasters have substantially reduced their false alarm ratios for IFR conditions by forecasting these events less often and settling for lower hit rates than guidance. As a result, the 3-6 hour GFS LAMP issued almost 3 false alarms of IFR conditions for every extra hit it scored over the TAF, and similar trends were noted when the national data were subdivided by region. Given the sensitivity of the aviation industry to false alarms in forecasts of low ceilings and low visibilities, this “trade-off” appears to be worthwhile.
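The scores named above can all be computed from a two-category contingency table. The sketch below uses the standard textbook definitions; the class and function names are illustrative, and the counts in the usage note are invented to show the effect, not taken from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Contingency:
    """Two-category (IFR / not IFR) verification counts."""
    hits: int               # IFR forecast, IFR observed
    false_alarms: int       # IFR forecast, IFR not observed
    misses: int             # IFR not forecast, IFR observed
    correct_negatives: int  # IFR not forecast, IFR not observed


def hit_rate(t: Contingency) -> float:
    return t.hits / (t.hits + t.misses)


def false_alarm_ratio(t: Contingency) -> float:
    return t.false_alarms / (t.hits + t.false_alarms)


def threat_score(t: Contingency) -> float:
    return t.hits / (t.hits + t.misses + t.false_alarms)


def heidke_skill_score(t: Contingency) -> float:
    a, b, c, d = t.hits, t.false_alarms, t.misses, t.correct_negatives
    numerator = 2 * (a * d - b * c)
    denominator = (a + c) * (c + d) + (a + b) * (b + d)
    return numerator / denominator


def false_alarms_per_extra_hit(guidance: Contingency, taf: Contingency) -> float:
    """Extra false alarms the guidance issues for each extra hit it gains over the TAF."""
    extra_hits = guidance.hits - taf.hits
    extra_false_alarms = guidance.false_alarms - taf.false_alarms
    return extra_false_alarms / extra_hits
```

With invented counts such as `Contingency(50, 25, 50, 875)` for the TAF and `Contingency(60, 54, 40, 846)` for the guidance, the threat scores and Heidke skill scores come out close to each other, while the guidance shows a higher hit rate, a higher false alarm ratio, and roughly 3 extra false alarms per extra hit. This is the pattern the abstract describes, reproduced here only with made-up numbers.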

The GFS LAMP was consistently the best performer of the three guidance products at all projections beyond 3 hours. In the 0-3 hour period, persistence outperformed all guidance products and the TAFs, except that the TAFs had the lowest (best) false alarm ratios.
