As Harvey approached landfall, uncertainty in its track contributed to uncertainty in forecasts of rainfall totals and distributions. Operational forecast ensembles generated one day before landfall showed substantial track spread due to weak steering currents. For example, in the European Centre for Medium-Range Weather Forecasts (ECMWF) ensemble initialized at 0000 UTC 25 August, the largest group of members forecast Harvey to re-emerge over the Gulf of Mexico and move toward extreme east Texas or Louisiana. However, substantial minorities of ensemble members forecast Harvey to move slowly near the Texas coast or turn southward toward the Rio Grande Valley.
For extreme events such as Harvey’s rainfall, forecasting and preparation require an understanding of the range of possible scenarios. Track clustering provides a method for distilling a large number of forecast tracks into a small number of groups based on storm motion. Clustering thus has the potential to bridge the gap between deterministic and probabilistic track forecasts by acknowledging the temporal evolution of forecast uncertainty and presenting a small number of representative outcomes. However, for many tropical cyclones, including Harvey, hazardous conditions occur well away from the track of the storm center; it is therefore critical to relate each track cluster to the hazards expected from it.
Track clustering is explored here as a method to understand distinct forecast scenarios of Harvey’s rainfall from an ensemble of 51 convection-permitting Weather Research and Forecasting (WRF) simulations. Initial and boundary conditions for each simulation are obtained from ECMWF ensemble forecasts initialized at 0000 UTC 25 August (27 hours before landfall). WRF simulations are initialized at 0000 UTC 25 August and run for 168 hours, terminating at 0000 UTC 1 September.
Tracks of Harvey derived from the WRF ensemble are partitioned into clusters using regression mixture modeling. To determine the optimal polynomial order and number of clusters, the Bayesian information criterion (BIC) is computed for all candidate partitions. The average displacement of track positions relative to the cluster mean is also examined.
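As an illustration of this selection step, the following is a minimal sketch of a polynomial regression mixture fitted by expectation-maximization, with BIC evaluated for each candidate number of clusters. It is not the implementation used in this study. The function name `fit_track_mixture`, the single isotropic variance per cluster, the nearest-seed initialization, and the use of the number of tracks as the BIC sample size are all assumptions made for the example, and the tracks are synthetic.

```python
import numpy as np


def fit_track_mixture(tracks, times, order, n_clusters,
                      n_init=5, n_iter=200, tol=1e-6, seed=0):
    """Fit a polynomial regression mixture to tracks of shape (n_tracks, n_times, 2).

    Returns (labels, log_likelihood, bic) for the best of n_init random starts.
    """
    tracks = np.asarray(tracks, dtype=float)
    times = np.asarray(times, dtype=float)
    n, n_times, n_dims = tracks.shape
    # Rescale time to [0, 1] so the polynomial design matrix stays well conditioned.
    s = (times - times.min()) / (times.max() - times.min())
    X = np.vander(s, order + 1, increasing=True)  # (n_times, order + 1)
    rng = np.random.default_rng(seed)
    best = None
    for _ in range(n_init):
        # Hard initial assignment: each track joins the nearest of K random seed tracks.
        seeds = tracks[rng.choice(n, size=n_clusters, replace=False)]
        dist = ((tracks[:, None] - seeds[None]) ** 2).sum(axis=(2, 3))
        resp = np.eye(n_clusters)[dist.argmin(axis=1)]  # responsibilities, (n, K)
        log_lik = -np.inf
        for _ in range(n_iter):
            # M-step: all tracks share the same times, so the weighted least-squares
            # fit for a cluster equals an ordinary fit to its weighted-mean track.
            weights = resp.sum(axis=0) + 1e-12
            mix = weights / weights.sum()
            mean_tracks = np.einsum("nk,ntd->ktd", resp, tracks) / weights[:, None, None]
            coefs = np.stack([np.linalg.lstsq(X, m, rcond=None)[0] for m in mean_tracks])
            sq = ((tracks[:, None] - (X @ coefs)[None]) ** 2).sum(axis=(2, 3))  # (n, K)
            var = (resp * sq).sum(axis=0) / (weights * n_times * n_dims) + 1e-9
            # E-step: Gaussian log-density of each whole track under each cluster.
            log_p = (np.log(mix)
                     - 0.5 * n_times * n_dims * np.log(2.0 * np.pi * var)
                     - 0.5 * sq / var)
            peak = log_p.max(axis=1, keepdims=True)
            log_norm = peak[:, 0] + np.log(np.exp(log_p - peak).sum(axis=1))
            resp = np.exp(log_p - log_norm[:, None])
            converged = abs(log_norm.sum() - log_lik) < tol
            log_lik = log_norm.sum()
            if converged:
                break
        if best is None or log_lik > best[1]:
            best = (resp.argmax(axis=1), log_lik)
    labels, log_lik = best
    # Free parameters: polynomial coefficients and one variance per cluster,
    # plus K - 1 mixing proportions. Sample size is taken as the number of tracks
    # (an assumption; other conventions count individual track points).
    n_params = n_clusters * ((order + 1) * n_dims + 1) + (n_clusters - 1)
    bic = -2.0 * log_lik + n_params * np.log(n)
    return labels, log_lik, bic


# Synthetic demonstration only: two made-up track families, not Harvey forecast data.
rng = np.random.default_rng(1)
times = np.linspace(0.0, 168.0, 15)  # forecast hours
s = times / 168.0
east = np.stack([28.0 + 2.0 * s, -97.0 + 4.0 * s], axis=1)   # (lat, lon)
south = np.stack([28.0 - 2.0 * s, -97.0 - 1.0 * s], axis=1)
tracks = np.concatenate([east + 0.1 * rng.standard_normal((20, 15, 2)),
                         south + 0.1 * rng.standard_normal((20, 15, 2))])
bic_by_k = {k: fit_track_mixture(tracks, times, order=2, n_clusters=k)[2]
            for k in (1, 2, 3)}
```

Lower BIC indicates a better trade-off between fit and complexity. In practice the same loop would also run over candidate polynomial orders, and the BIC values would be read alongside the displacement diagnostic rather than in isolation.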
After optimal cluster parameters are selected, the distribution of rainfall hazard by cluster is examined. The ability of track clustering to produce distinct hazard scenarios is investigated using the temporal and spatial distribution of the accumulated rainfall field; rainfall at select locations (e.g., Houston, Beaumont, Lake Charles) is also examined. Inter-cluster differences in the accumulated rainfall metrics are compared with intra-cluster differences to illustrate the relative importance of large-scale track variability and other factors (e.g., the size and structure of Harvey’s precipitation field) in determining rainfall distribution. Additional statistical methods for characterizing forecast uncertainty in Harvey’s rainfall hazards will also be presented.
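One possible way to compare inter-cluster and intra-cluster differences is an analysis-of-variance style decomposition of ensemble rainfall variance at each grid point. The sketch below is an assumption for illustration, not necessarily the metric used in this study. The function name `variance_partition` and the array layout (members, y, x) are hypothetical, and the input values are made up.

```python
import numpy as np


def variance_partition(rain, labels):
    """Split ensemble sum of squares into between- and within-cluster parts.

    rain:   accumulated rainfall, shape (n_members, ny, nx)
    labels: cluster index of each member, shape (n_members,)
    Returns (between, within, fraction_between), each of shape (ny, nx).
    """
    rain = np.asarray(rain, dtype=float)
    labels = np.asarray(labels)
    grand_mean = rain.mean(axis=0)
    between = np.zeros_like(grand_mean)
    within = np.zeros_like(grand_mean)
    for k in np.unique(labels):
        members = rain[labels == k]
        cluster_mean = members.mean(axis=0)
        # Spread of the cluster mean about the ensemble mean, weighted by cluster size.
        between += members.shape[0] * (cluster_mean - grand_mean) ** 2
        # Spread of the members about their own cluster mean.
        within += ((members - cluster_mean) ** 2).sum(axis=0)
    total = between + within
    # Fraction of ensemble variance explained by cluster membership (0 where total is 0).
    fraction = np.divide(between, total, out=np.zeros_like(total), where=total > 0)
    return between, within, fraction


# Made-up example: four members, two grid points, two clusters.
toy_rain = np.array([[[1.0, 5.0]], [[1.0, 7.0]], [[3.0, 5.0]], [[3.0, 7.0]]])
toy_labels = np.array([0, 0, 1, 1])
toy_between, toy_within, toy_fraction = variance_partition(toy_rain, toy_labels)
```

A fraction near 1 at a grid point suggests that rainfall there is controlled mainly by which track cluster a member belongs to, while a fraction near 0 suggests that other factors dominate.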