Tracking of Wind-Wave Systems Using K-Means Clustering

Van der Westhuysen, Andre Jaco

Third-generation spectral wind-wave models such as WAVEWATCH III (Tolman et al. 2016) produce output with a very large number of degrees of freedom (100M to 1B per time step on a typical model grid). To reduce this volume of information while retaining the details of complex wave fields, wave spectrum partitioning algorithms have been developed that group the significant wave components of each spectrum, such as swells and wind sea. This partitioned model output is increasingly being applied to provide targeted forecasts for specific marine activities (e.g. wave response of large commercial ships alongside conditions for small recreational craft). However, these partitioning algorithms operate locally in geographical space, independently at each grid point. As such, the coherence of the derived swell and wind sea partitions in geographical space and time cannot be guaranteed. In the present study, we propose an unsupervised machine learning approach for combining these independently computed wave partitions into spatially and temporally consistent wave systems. This task is cast as a clustering problem, which is solved using the K-Means algorithm from Python’s Scikit-Learn package. The input features to the clustering operation are the significant wave height, period and direction of each computed partition. Each data record comprises the feature values of one partition at one geographical location and one time step of the wave model simulation. The clustering operation therefore groups partitions with similar wave height, period and direction over the model domain and in time, and assigns them the same label. The partitions sharing a label, collected over geographical space and time, represent an identified wave system. Since the number of wave systems in each model run is not known a priori, and since the number of clusters K is an input parameter of the K-Means algorithm, the analysis is repeated over a range of K values and the one yielding the highest silhouette coefficient is selected.
This procedure has been incorporated into the post-processing of the National Weather Service’s operational Nearshore Wave Prediction System, which covers all US coastal waters. Examples of its favorable model performance are presented for a range of operational conditions.
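
The clustering and K-selection steps described above can be illustrated with a minimal sketch. It is not the operational implementation: the synthetic partition data, the helper names `build_features` and `select_k`, the feature standardization, and the encoding of direction as cosine and sine components (so that 359° and 1° are treated as neighbours) are all assumptions made for the example. Only the use of Scikit-Learn's K-Means and the silhouette coefficient comes from the text.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler


def build_features(hs, tp, dir_deg):
    """Stack per-partition height, period and direction into a feature matrix.

    Assumption: direction is encoded as (cos, sin) to respect its circular
    nature, and all columns are standardized so no single unit dominates.
    """
    theta = np.deg2rad(dir_deg)
    raw = np.column_stack([hs, tp, np.cos(theta), np.sin(theta)])
    return StandardScaler().fit_transform(raw)


def select_k(features, k_values, seed=0):
    """Run K-Means for each candidate K; keep the highest silhouette score."""
    best_k, best_score, best_labels = None, -np.inf, None
    for k in k_values:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed)
        labels = model.fit_predict(features)
        score = silhouette_score(features, labels)
        if score > best_score:
            best_k, best_score, best_labels = k, score, labels
    return best_k, best_score, best_labels


# Synthetic stand-in for partition records pooled over grid points and time
# steps (hypothetical values): two swells and one wind sea.
rng = np.random.default_rng(42)
n = 200
hs = np.concatenate([rng.normal(2.0, 0.10, n),    # long-period swell
                     rng.normal(1.0, 0.10, n),    # local wind sea
                     rng.normal(0.5, 0.05, n)])   # second swell
tp = np.concatenate([rng.normal(14.0, 0.3, n),
                     rng.normal(5.0, 0.3, n),
                     rng.normal(10.0, 0.3, n)])
dir_deg = np.mod(np.concatenate([rng.normal(200.0, 5.0, n),
                                 rng.normal(350.0, 5.0, n),
                                 rng.normal(90.0, 5.0, n)]), 360.0)

features = build_features(hs, tp, dir_deg)
best_k, best_score, labels = select_k(features, range(2, 7))
print(f"selected K = {best_k}, silhouette = {best_score:.2f}")
```

Each entry of `labels` is the wave-system label of the corresponding partition record. Computing the silhouette coefficient over all records scales poorly with record count; for large model grids, the `sample_size` argument of `silhouette_score` is one possible way to evaluate it on a random subset.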

References

Tolman, H.L., et al. (2016) User manual and system documentation of WAVEWATCH III version 5.16. The WAVEWATCH III Development Group (WW3DG). NOAA/NWS/NCEP/MMAB Tech. Note 329, College Park, MD, USA, 326 pp.
