228 Clustering of Multi-model Hurricane Sandy Track and Cyclone Phase Space Forecasts using a Regression Mixture Model

Monday, 11 January 2016
Alex Kowaleski, Pennsylvania State University, University Park, PA; and J. L. Evans


Global model skill varied substantially for medium-range track forecasts of Hurricane Sandy (2012). The European Centre for Medium-Range Weather Forecasts (ECMWF) deterministic model and ensembles correctly forecast the sharp left turn leading to landfall on the Mid-Atlantic coast up to a week before the event (Magnusson et al. 2013). Other models, including the United States' Global Forecast System (GFS) deterministic model and ensembles, produced erroneous medium-range forecasts in which Sandy moved out to sea.

Understanding ensemble uncertainty in storm track and in storm structure, as described by the cyclone phase space (CPS; Hart 2003), is critical to forecasting storm-related hazards associated with the extratropical transition (ET) of initially tropical cyclones. In this study, regression mixture modeling is used to cluster track and CPS forecasts of Hurricane Sandy in a combined ensemble drawn from four ensemble prediction systems (EPSs): ECMWF, GEFS, Canadian GEM, and the United Kingdom UKMO.

Among the four models studied, 117 forecasts (113 ensemble forecasts and four control forecasts) are issued every 12 hours. Seven-day forecasts initialized at 12-hour intervals from 00 UTC 23 October to 00 UTC 27 October are clustered using a path-clustering regression mixture model (Gaffney et al. 2007; Camargo et al. 2007). To ensure a robust solution, clustering is repeated 500 times with different random initial membership weights, and the set of cluster assignments with the highest likelihood is selected. For each forecast, clustering is repeated for two through seven clusters and for polynomial orders two through five. Polynomial order is allowed to vary among forecasts; however, a single optimal cluster number is sought for all forecasts. Analysis of log-likelihood, Bayesian Information Criterion, average distance, and average squared distance suggests that four clusters is optimal.
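The restart-and-select procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes every path shares one time grid (normalized to [0, 1]), uses a single isotropic Gaussian noise variance per cluster, and runs on synthetic tracks. The names `fit_once` and `fit_best` are hypothetical, and only 20 restarts are used to keep the example fast.

```python
import numpy as np

def fit_once(Y, t, K, order, rng, n_iter=200, tol=1e-8):
    """One EM run of a polynomial regression mixture.
    Y: (N, n, d) paths, t: (n,) shared times. Returns (log-likelihood, labels)."""
    N, n, d = Y.shape
    X = np.vander(t, order + 1, increasing=True)   # polynomial design matrix (n, order+1)
    pinv = np.linalg.pinv(X)                       # least-squares solver (order+1, n)
    W = rng.dirichlet(np.ones(K), size=N)          # random initial membership weights
    prev = -np.inf
    for _ in range(n_iter):
        # M-step: because X is shared, weighted least squares reduces to
        # fitting the polynomial to each cluster's weighted-mean path.
        Nk = W.sum(axis=0) + 1e-12
        Ybar = np.einsum("ik,ind->knd", W, Y) / Nk[:, None, None]
        beta = pinv @ Ybar                         # (K, order+1, d)
        mean = X @ beta                            # cluster mean paths (K, n, d)
        sq = ((Y[:, None] - mean[None]) ** 2).sum(axis=(2, 3))   # (N, K)
        var = (W * sq).sum(axis=0) / (Nk * n * d) + 1e-9
        pi = Nk / N
        # E-step: membership weights from Gaussian log-densities.
        logp = np.log(pi) - 0.5 * n * d * np.log(2 * np.pi * var) - 0.5 * sq / var
        m = logp.max(axis=1, keepdims=True)
        lse = m[:, 0] + np.log(np.exp(logp - m).sum(axis=1))
        W = np.exp(logp - lse[:, None])
        ll = lse.sum()
        if abs(ll - prev) < tol:
            break
        prev = ll
    return ll, W.argmax(axis=1)

def fit_best(Y, t, K, order, n_restarts=20, seed=0):
    """Repeat EM from different random starts; keep the highest-likelihood run."""
    rng = np.random.default_rng(seed)
    runs = [fit_once(Y, t, K, order, rng) for _ in range(n_restarts)]
    return max(runs, key=lambda r: r[0])

# Synthetic demo: 20 tracks curving west and 20 curving east (lon, lat in degrees).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 15)                      # 15 times, e.g. 0 to 168 h every 12 h
west = np.stack([-75 - 5 * t**2, 25 + 15 * t], axis=-1)
east = np.stack([-75 + 5 * t**2, 25 + 15 * t], axis=-1)
Y = np.concatenate([west + rng.normal(0, 0.3, (20, 15, 2)),
                    east + rng.normal(0, 0.3, (20, 15, 2))])
loglik, labels = fit_best(Y, t, K=2, order=2)
```

In a fuller version, `fit_best` would be called for each candidate cluster number and polynomial order, and the resulting log-likelihoods compared with a penalized criterion such as the Bayesian Information Criterion.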

Mean cluster paths and individual ensemble members within the clusters are analyzed to determine how the dominant cluster captures the track and landfall location of Sandy. Cluster membership among track and CPS clusters is analyzed to determine relationships between simulated track and structural evolution.

Track clustering partitions tracks between those that move out to sea and those that make landfall in varying locations along the United States east coast (Fig. 1).  The among-cluster spread shrinks with time as clusters converge toward landfall on the Mid-Atlantic.  An east-moving cluster is observed in all forecasts except 00 UTC 27 October; however, its population decreases with time. 

Even at long lead times, track clustering shows the threat to the Mid-Atlantic. The mean path of the dominant cluster makes landfall within 168 hours for all forecasts except 12 UTC 23 October. In most cases the landfall location of the dominant cluster is near the location of observed landfall. This is especially true for forecasts on and after 25 October, in which the dominant cluster landfall location error is less than 100 km.


Figure 1: Mean tracks of each track cluster for forecasts initialized from 00 UTC 23 October through 00 UTC 27 October. Dominant clusters are bolded. Dots indicate mean cluster position at 00 UTC 30 October. The best track is shown in black.

The divisions among CPS clusters vary substantially among forecasts; however, spread among CPS mean paths decreases somewhat with time as clusters show increasingly similar ET evolution (Fig. 2).  In many forecasts, CPS clustering successfully partitions forecasts of Sandy by the rapidity and completeness of ET and asymmetry during ET.


Figure 2: Mean CPS paths of each CPS cluster for forecasts initialized from 00 UTC 23 October through 00 UTC 27 October. Dominant clusters are bolded. Dots indicate mean cluster position at 00 UTC 30 October. CPS paths are color-coded by the mean track of each CPS cluster: red is the farthest west, followed by blue, magenta, and green.

The Adjusted Rand Index (ARI; Hubert and Arabie 1985) is used to compare the track and CPS clustering results. In all forecasts, the ARI between track and CPS cluster assignments is above 0.10, indicating a relationship between track and CPS cluster memberships. The ARI generally increases between 00 UTC 23 October and 00 UTC 25 October and generally decreases between 00 UTC 25 October and 00 UTC 27 October.
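For readers unfamiliar with the ARI, the following sketch computes it from the contingency table of two label vectors, following the Hubert and Arabie (1985) form. The function name `adjusted_rand_index` and the toy labels are illustrative assumptions, not data from this study.

```python
from math import comb
import numpy as np

def adjusted_rand_index(a, b):
    """ARI between two clusterings of the same items (label values are arbitrary)."""
    a = np.asarray(a).ravel()
    b = np.asarray(b).ravel()
    _, ai = np.unique(a, return_inverse=True)
    _, bi = np.unique(b, return_inverse=True)
    # Contingency table: table[i, j] = items in cluster i of a and cluster j of b.
    table = np.zeros((ai.max() + 1, bi.max() + 1), dtype=int)
    np.add.at(table, (ai, bi), 1)
    sum_cells = sum(comb(int(n), 2) for n in table.ravel())
    sum_rows = sum(comb(int(n), 2) for n in table.sum(axis=1))
    sum_cols = sum(comb(int(n), 2) for n in table.sum(axis=0))
    expected = sum_rows * sum_cols / comb(len(a), 2)   # chance agreement
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:                          # degenerate case, e.g. one cluster each
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

# Toy example: the second partition splits one cluster of the first.
ari_split = adjusted_rand_index([0, 0, 1, 1], [0, 0, 1, 2])
# Relabeling clusters does not change the index.
ari_relabel = adjusted_rand_index([0, 0, 1, 1], [5, 5, 3, 3])
```

An ARI of 1 indicates identical partitions and a value near 0 indicates agreement no better than chance, which is why values consistently above 0.10 are read as evidence of a relationship between the two sets of assignments.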

Analyses of cluster assignments for forecasts initialized at 00 UTC 25 October reveal a somewhat complex relationship between track and CPS cluster membership (Fig. 3). A single cluster is assigned the majority of ensemble forecasts (coincidentally, 61/117 in each case) for both the track- and the CPS-based partitions (dominant clusters are identified as blue for track and red for CPS in Fig. 3). Forty-seven of 61 members in the dominant track cluster are found in the dominant CPS cluster, but 9 of 12 members in the farthest-west (red in Fig. 3) track cluster are also found in the red CPS cluster. Thus, while there is strong consistency between the track- and CPS-based clusters, track clustering produces a far-west cluster that is not isolated in CPS clustering (cf. Figs. 3a and 3c).

Substantial overlap is also evident between the blue and magenta track clusters and the magenta and green track and CPS clusters. Twenty-three of 24 forecasts in the blue CPS cluster are in the magenta or blue track cluster. All 13 forecasts in the green track cluster are in the magenta or green CPS cluster, while 12 of 13 forecasts in the magenta CPS cluster are in the magenta or green track cluster. The average tracks per CPS cluster and CPS paths per track cluster (Fig. 3c-d) confirm the relationship between track and CPS cluster membership.

The results of this study illustrate the partitioning of multiple EPS forecasts of Hurricane Sandy (2012) via mixture modeling of path clusters. Clustering of 7-day forecasts is performed using EPS hurricane tracks and also using CPS path forecasts. Substantial agreement between track-based and CPS-based cluster membership is found, consistent with the relationship between storm track and storm structure (e.g., Adem 1956; Holland and Evans 1992).


Figure 3: Cluster-mean trajectories for 00 UTC 25 October ensemble forecasts: storm tracks based on (a) track clustering and (c) CPS-based clustering; CPS paths based on (b) track clustering and (d) CPS-based clustering.
















