The turbulence algorithm intercomparison exercise: statistical verification results

Brown, Barbara G.

During the winter of 1998-99, forecasts of clear-air turbulence produced by a large number of turbulence indices were compared. The 14 algorithms considered in the study include a number that have been available for many years, as well as algorithms that are newly under development. The algorithm forecasts were based on output of the RUC-2 numerical weather prediction model for the period 21 December 1998 to 31 March 1999. Forecasts issued at 1200, 1500, and 1800 UTC with 3-, 6-, and 9-hr lead times were included in the study. Turbulence AIRMETs, the operational turbulence forecast product issued by the NWS's Aviation Weather Center (AWC), were also included in the evaluation. The evaluation was limited to the continental United States and to altitudes above 20,000 ft.

The forecasts were verified using Yes and No turbulence observations from pilot reports (PIREPs), as well as No observations based on automated vertical accelerometer (AVAR) data that were obtained from a number of aircraft. The algorithms were evaluated as Yes/No turbulence forecasts by applying a threshold to convert the output of each algorithm to a Yes or No value. A variety of thresholds was applied to each algorithm. The verification analyses were primarily based on the algorithms' ability to discriminate between Yes and No observations, as well as the extent of their coverage.
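The thresholding step described above can be sketched as follows. This is a minimal illustration, not the RTVS or NCAR verification code: the function names, the sample index values, and the threshold of 0.5 are all hypothetical. It converts algorithm output at observation locations to Yes/No forecasts, then computes the fraction of Yes observations correctly forecast (often called PODy) and the fraction of No observations correctly forecast (PODn). These two detection rates are assumed here as measures of discrimination; the abstract does not name them.

```python
def to_yes_no(values, threshold):
    """Convert algorithm output to Yes/No forecasts (True means Yes)."""
    return [v >= threshold for v in values]


def detection_rates(forecasts, observations):
    """Return (PODy, PODn) for paired Yes/No forecasts and observations."""
    # Forecast values at locations where turbulence was / was not observed
    at_yes_obs = [f for f, o in zip(forecasts, observations) if o]
    at_no_obs = [f for f, o in zip(forecasts, observations) if not o]
    # Fraction of Yes observations that were forecast Yes
    pod_yes = sum(at_yes_obs) / len(at_yes_obs) if at_yes_obs else float("nan")
    # Fraction of No observations that were forecast No
    pod_no = (
        sum(not f for f in at_no_obs) / len(at_no_obs)
        if at_no_obs
        else float("nan")
    )
    return pod_yes, pod_no


# Hypothetical index values at six observation locations (not study data)
index_values = [0.2, 0.9, 0.6, 0.1, 0.7, 0.3]
observed = [False, True, True, False, False, True]

forecasts = to_yes_no(index_values, threshold=0.5)
pod_yes, pod_no = detection_rates(forecasts, observed)
```

Repeating the calculation for several thresholds gives one (PODy, PODn) pair per threshold, which is how a single algorithm can be compared with others at comparable thresholds.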

The study consisted of two components. First, the algorithms were evaluated in near-real-time by the Real-Time Verification System (RTVS) of the NOAA Forecast Systems Laboratory (FSL), with results displayed on the World-Wide Web (http://www-ad.fsl.noaa.gov/afra/rtvs/RTVS-project_des.html). Second, the verification results were re-evaluated in depth using a post-analysis verification system at the National Center for Atmospheric Research (NCAR), with additional thresholds applied to each algorithm to provide a thorough depiction of algorithm quality.

Results of the intercomparison suggest that some algorithms perform somewhat better than others. In particular, these algorithms have somewhat larger values of the True Skill Statistic for comparable thresholds, and they have a slightly larger overall discrimination skill statistic. However, the best algorithms have very similar performance characteristics. In some (but not all) cases the algorithm performance is slightly better than the performance of the AIRMETs. Results of the study also suggest that further algorithm development is needed before newer algorithms will show large improvements over some of the older algorithms. Moreover, algorithms such as the Integrated Turbulence Forecasting Algorithm (ITFA) may benefit from excluding some algorithms that have relatively little forecasting skill.
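For reference, the True Skill Statistic (also known as the Hanssen-Kuipers discriminant) is conventionally defined as PODy + PODn - 1, where PODy and PODn are the fractions of Yes and No observations correctly forecast. The sketch below computes it at several thresholds for one algorithm. The function name, the sample values, and the thresholds are hypothetical and are not taken from the study.

```python
def true_skill_statistic(values, observations, threshold):
    """TSS = PODy + PODn - 1 for one threshold.

    Assumes at least one Yes and one No observation are present.
    """
    hits = misses = false_alarms = correct_nos = 0
    for value, obs in zip(values, observations):
        forecast_yes = value >= threshold
        if forecast_yes and obs:
            hits += 1
        elif not forecast_yes and obs:
            misses += 1
        elif forecast_yes and not obs:
            false_alarms += 1
        else:
            correct_nos += 1
    pod_yes = hits / (hits + misses)
    pod_no = correct_nos / (correct_nos + false_alarms)
    return pod_yes + pod_no - 1.0


# Hypothetical index values and Yes/No observations (not study data)
index_values = [0.2, 0.9, 0.6, 0.1, 0.7, 0.3]
observed = [False, True, True, False, False, True]

tss_by_threshold = {
    t: true_skill_statistic(index_values, observed, t)
    for t in (0.25, 0.5, 0.75)
}
```

A TSS of 1 indicates perfect discrimination between Yes and No observations, while a value of 0 indicates no skill, so an algorithm with a larger TSS at a comparable threshold discriminates better.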

A second algorithm intercomparison is in progress for winter 1999-2000. Available results of this second evaluation will also be reported and compared to the winter 1998-1999 results.

7.4 The turbulence algorithm intercomparison exercise: statistical verification results