614 Identification of Midnight-to-Midnight Hourly Wind Pattern Modes Utilizing the V-Fold Cross Validation Algorithm Applied to K-Means Clustering Analysis

Wednesday, 9 January 2013
Exhibit Hall 3 (Austin Convention Center)
Charles Fisk, Naval Base Ventura County, Pt. Mugu, California
Manuscript (916.3 kB)

Climatological wind variability is an important meteorological element to consider in planning and decision-making activities in which weather conditions are crucial on some level. Wind rose diagrams, for example, can provide insight into the wind character for individual hours of interest by depicting the most favored compass directions and associated speeds. Resultant wind calculations can be valuable in producing distilled single-value statistics derived from many individual observations. Also interesting and potentially insightful is information on the most prominent contiguous hour-to-hour wind patterns that occur climatologically at specified times of the year. Such results are a logical complement to the more individual-hour focused statistics. Just as there are favored individual hourly directions and related speeds, there should be preferred collective hour-to-hour patterns, or “modes”. Resolving patterns of this kind can be treated as a clustering problem, and K-Means Clustering Analysis is frequently used for such problems. One drawback or “nuisance factor” of traditional K-Means is that the researcher has to guess the number of clusters in advance; the ultimate choice of how many there “are” requires trial-and-error iterations combined with subjective judgment. Recent statistical methodological advances, however, have resulted in adaptation of the V-fold Cross-Validation Algorithm, a training-sample type procedure, which, when incorporated into K-Means, allows for a more objective determination of the “right” number of clusters.
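The idea of choosing the number of clusters by V-fold cross-validation can be illustrated with a minimal Python sketch. This is not the software or exact procedure used in the study. It assumes one common formulation: for each candidate k, K-Means is fit on the training folds, the mean distance from held-out observations to their nearest training centroid is averaged over the folds, and the smallest k is kept beyond which an extra cluster reduces that cost by less than a chosen fraction. The function names (`init_farthest_first`, `kmeans`, `cv_cost`, `choose_k`), the farthest-first initialization, and the 5% threshold `min_gain` are all assumptions made for illustration.

```python
import numpy as np

def init_farthest_first(X, k, rng):
    """Pick k starting centroids: one random row, then repeatedly the row
    farthest from all rows chosen so far (an illustrative choice of init)."""
    idx = [int(rng.integers(len(X)))]
    d2 = ((X - X[idx[0]]) ** 2).sum(axis=1)
    for _ in range(1, k):
        nxt = int(d2.argmax())
        idx.append(nxt)
        d2 = np.minimum(d2, ((X - X[nxt]) ** 2).sum(axis=1))
    return X[idx].copy()

def kmeans(X, k, rng, n_iter=100):
    """Plain Lloyd's algorithm; returns the (k, n_features) centroid array."""
    centroids = init_farthest_first(X, k, rng)
    for _ in range(n_iter):
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):
            break
        centroids = new
    return centroids

def cv_cost(X, k, v=10, seed=0):
    """Mean held-out distance to the nearest training centroid, averaged over v folds."""
    rng = np.random.default_rng(seed)  # same seed gives the same folds for every k
    folds = np.array_split(rng.permutation(len(X)), v)
    costs = []
    for i in range(v):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(v) if j != i])
        centroids = kmeans(X[train_idx], k, rng)
        diff = X[test_idx][:, None, :] - centroids[None, :, :]
        dist = np.sqrt((diff ** 2).sum(axis=2))
        costs.append(dist.min(axis=1).mean())
    return float(np.mean(costs))

def choose_k(X, k_max=10, min_gain=0.05, v=10, seed=0):
    """Smallest k after which one more cluster cuts the CV cost by less than min_gain."""
    prev = cv_cost(X, 1, v, seed)
    for k in range(2, k_max + 1):
        cost = cv_cost(X, k, v, seed)
        if (prev - cost) / prev < min_gain:
            return k - 1
        prev = cost
    return k_max
```

Here `X` is assumed to be a NumPy array with one row per observation (for example, one row per day) and one column per clustering variable. Because the cost is evaluated on held-out rows, adding clusters that merely fit noise in the training folds yields little or no improvement, which is what makes the stopping rule less subjective than inspecting training error alone.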

As a demonstration of this tool, the optimal number of midnight-to-midnight contiguous hourly wind patterns is determined for a first-order station (La Guardia Airport, New York; 1949 to 2011 data) for a selection of calendar months (January, April, July, and October, plus all the calendar months as a unit). Data input consists of hourly u and v wind components, creating a clustering problem in 48-dimensional space (24 hourly u values and 24 hourly v values per day). After generation of the clusters, the mean u's and v's within each cluster are reconstructed into 24 individual hourly resultant wind statistics (direction, speed, and constancy). To facilitate interpretation of the results in the summary analysis, mean cluster hourly temperature and relative humidity statistics are also included. The analysis resolved four clusters for January; five each for April, July, and October; and six for all the months as a single unit.
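The conversion to u and v components and the reconstruction of resultant wind statistics can be sketched as follows. This is a hedged illustration, not the author's code. It assumes the usual meteorological convention (direction is where the wind blows from, u positive toward the east, v positive toward the north) and takes constancy as the ratio of the resultant (vector-mean) speed to the mean scalar speed; the abstract does not state which definitions were used. The function names `to_uv` and `resultant_stats` are hypothetical.

```python
import numpy as np

def to_uv(direction_deg, speed):
    """Convert 'from' direction (degrees) and speed to u (eastward) and v (northward)."""
    theta = np.deg2rad(direction_deg)
    u = -speed * np.sin(theta)
    v = -speed * np.cos(theta)
    return u, v

def resultant_stats(u, v):
    """Hourly resultant direction, speed, and constancy for one cluster.

    u, v: arrays of shape (n_days, 24) holding the cluster's member days.
    Returns three arrays of length 24.
    """
    u_mean = u.mean(axis=0)
    v_mean = v.mean(axis=0)
    res_speed = np.hypot(u_mean, v_mean)
    # Direction the mean vector blows from, in degrees clockwise from north.
    res_dir = np.rad2deg(np.arctan2(-u_mean, -v_mean)) % 360.0
    mean_scalar_speed = np.hypot(u, v).mean(axis=0)
    # Constancy: 1 when every day has the same direction, near 0 when
    # directions cancel; left at 0 for hours that are calm on every day.
    constancy = np.divide(res_speed, mean_scalar_speed,
                          out=np.zeros_like(res_speed),
                          where=mean_scalar_speed > 0)
    return res_dir, res_speed, constancy
```

Under these assumptions, a day's clustering input would be its 24 u values followed by its 24 v values, and `resultant_stats` would be applied to the member days of each cluster after the clustering step.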
