Validation of a Multi-Model Ensemble For Tropical Cyclone Forecasts Over the Western North Atlantic
Nicholas Leonardo and Brian A. Colle
School of Marine and Atmospheric Sciences, Stony Brook University, Stony Brook, NY
The annual-averaged track forecasts for tropical cyclones have improved during the past two decades, but there are still events with relatively large track errors (e.g., Joaquin 2015). Operational ensembles from various organizations have allowed for more probabilistic forecasts of tropical cyclone track and intensity. However, the verification of operational multi-model ensembles has been limited, with studies focusing on only one ensemble system, one particular storm, or just one or two seasons. Hence, our study assesses all available ensemble systems in terms of their collective and individual performance for various deterministic and probabilistic metrics during the past eight years (2008-2015).
The track forecasts for the 2008-2015 North Atlantic seasons are evaluated, focusing on lead times of 3-5 days. Three ensemble systems are verified: the ECMWF (51 members), the UKMET (23 members), and the GEFS (21 members), as well as all three ensembles combined (the Grand Ensemble). The cyclone tracks from these models are archived in the THORPEX Interactive Grand Global Ensemble (TIGGE) database and were downloaded through NCAR. Several other deterministic forecasts, such as the GFDL, the HWRF, and the NHC’s official forecast, are also evaluated using the track forecasts archived by NHC. All models are verified against the NHC best track data. The probabilistic skill of the ensembles is then quantified with the Brier Skill Score, using either the ECMWF deterministic forecast (“EC_det”) or the NHC official forecast with climatology-based cone (“OFCL”) as the reference. Rank histograms of the along- and cross-track errors of each ensemble member are used to inspect the Grand Ensemble’s dispersion.
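To make the skill metric concrete, the following is a minimal Python sketch of a Brier Skill Score computed against a reference forecast. It is an illustration under stated assumptions, not the study's verification code: the function names, the binary event definition (for example, whether the observed storm falls inside some region), and the sample numbers are all hypothetical. A deterministic reference such as EC_det is represented here by probabilities of 0 or 1.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes."""
    if not probs or len(probs) != len(outcomes):
        raise ValueError("probs and outcomes must be non-empty and equal in length")
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill_score(probs, ref_probs, outcomes):
    """Skill relative to a reference: 1 is perfect, 0 matches the reference,
    and negative values are worse than the reference."""
    bs = brier_score(probs, outcomes)
    bs_ref = brier_score(ref_probs, outcomes)
    if bs_ref == 0.0:
        raise ValueError("reference Brier score is zero; skill score is undefined")
    return 1.0 - bs / bs_ref


# Hypothetical data: four forecast cases of one binary event.
observed = [1, 0, 1, 0]                  # 1 if the event occurred
ensemble_probs = [0.8, 0.2, 0.6, 0.1]    # e.g. fraction of members forecasting the event
deterministic_ref = [1.0, 0.0, 0.0, 0.0] # a yes/no reference forecast

print(round(brier_skill_score(ensemble_probs, deterministic_ref, observed), 3))  # 0.75
```

A positive value indicates that the probabilistic forecast improves on the chosen reference for these cases.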
The annual-averaged performance of each modeling system shows large interannual variability, although there is a decrease in mean total track errors from 2010 to 2012. A negative (“slow”) along-track bias decreases in magnitude throughout this period, which is associated with extratropical transitions. The mean absolute errors of the Grand Ensemble are smaller than those of each individual ensemble, but the EC_det has smaller mean track errors than the Grand Ensemble mean. The ECMWF ensemble and the Grand Ensemble have the largest probabilistic skill relative to the EC_det, and their skill is similar to that of the NHC cone probabilities made using the ensemble data that the forecasters were able to peruse. The Grand Ensemble appears to be underdispersed in the along-track direction given the slow bias on average. Meanwhile, there is some overdispersion in the cross-track direction. The Grand Ensemble struggles most at day 5 with recurving storms off the East Coast, with the initial points of these forecast tracks clustered over and east of Puerto Rico.
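The dispersion statements above rest on rank histograms, which the following minimal Python sketch illustrates. It assumes that each member's along-track (or cross-track) error is signed relative to the observed position, so the observation sits at zero error and its rank is the number of members with negative error. The function name, the tie handling, and the sample values are hypothetical and are not taken from the study.

```python
def rank_histogram(member_errors_by_case):
    """Count, for each case, the rank of the observation (error = 0)
    among the signed member errors. Returns n_members + 1 bin counts."""
    if not member_errors_by_case:
        raise ValueError("at least one forecast case is required")
    n_members = len(member_errors_by_case[0])
    counts = [0] * (n_members + 1)
    for errors in member_errors_by_case:
        if len(errors) != n_members:
            raise ValueError("every case must have the same number of members")
        # Rank = number of members behind the observation (negative error).
        # Assumption: exact ties with zero are counted as not behind.
        rank = sum(1 for e in errors if e < 0.0)
        counts[rank] += 1
    return counts


# Hypothetical along-track errors (km) for a 3-member ensemble, 4 cases.
cases = [
    [-120.0, -80.0, -30.0],   # all members too slow
    [-60.0, -10.0, 40.0],
    [-200.0, -150.0, -90.0],  # all members too slow
    [-20.0, 30.0, 70.0],
]

print(rank_histogram(cases))  # [0, 1, 1, 2]
```

In this toy example the counts pile up in the highest bin because the observation often lies ahead of every member, which is the signature expected from a slow along-track bias. A flat histogram would suggest a well-calibrated spread, and a dome-shaped one would suggest overdispersion.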