221
Analysis of Month-to-Month Precipitation Variability Patterns for Los Angeles, San Diego, and San Francisco (1877–78 through 2012–13 seasons) Utilizing K-Means Clustering Analysis Integrated with the V-Fold Cross Validation Algorithm
Monday, 3 February 2014
Hall C3 (The Georgia World Congress Center)
Long-term monthly averages are a traditional means of characterizing climatological precipitation variability over the course of a rain year. Frequently based on a 30-year period of record, they serve as monthly precipitation “normals”, which are the basis for anomaly calculations. Such “normals”, however, are only statistical idealizations, and the contiguous month-to-month rainfall patterns of individual years invariably depart from these means in a variety of ways. Inherent tendencies may exist for wet or dry anomalies to cluster over preferred groups of months, either in general or during El Nino or La Nina episodes. Information on such tendencies, if real, would be a useful complement to the more conventional single-month statistics. To explore these possibilities, this study investigates the existence and relative frequencies of contiguous monthly precipitation anomaly modes for three California localities with lengthy periods of record: Los Angeles, San Diego, and San Francisco. K-means clustering analysis integrated with the V-fold cross-validation algorithm is applied. The V-fold algorithm, a “training-sample” data mining procedure, allows for a more objective determination of the optimal number of clusters when incorporated into K-means. The raw data are normalized by month, and both Euclidean and squared Euclidean distance measures are used to form the clusters.
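As an illustration only, the following minimal Python sketch shows one way K-means can be combined with V-fold cross-validation to pick the number of clusters. It is not the authors' implementation. All function names (`zscore_columns`, `kmeans`, `cv_cost`, `choose_k`), the default of 5 folds, and the 5% improvement threshold used as a stopping rule are assumptions made for this example. Each row is taken to be one rain year and each column one month (or month group).

```python
import math
import random

def zscore_columns(rows):
    """Normalize by month: zero mean and unit variance for each column."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    # A constant column would have zero spread; fall back to 1.0 to avoid dividing by zero.
    sds = [math.sqrt(sum((x - m) ** 2 for x in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(x - m) / s for x, m, s in zip(r, means, sds)] for r in rows]

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(rows, k, rng, iters=50):
    """Plain Lloyd iterations; returns the k cluster centers."""
    centers = rng.sample(rows, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for r in rows:
            nearest = min(range(k), key=lambda j: sq_dist(r, centers[j]))
            groups[nearest].append(r)
        # An empty cluster keeps its previous center.
        new = [[sum(c) / len(g) for c in zip(*g)] if g else centers[j]
               for j, g in enumerate(groups)]
        if new == centers:
            break
        centers = new
    return centers

def cv_cost(rows, k, v=5, seed=0, squared=False):
    """Mean distance from held-out rows to the nearest center fit on the other folds."""
    rng = random.Random(seed)
    idx = list(range(len(rows)))
    rng.shuffle(idx)
    total = 0.0
    for f in range(v):
        held = set(idx[f::v])
        train = [rows[i] for i in idx if i not in held]
        centers = kmeans(train, k, rng)
        for i in held:
            d2 = min(sq_dist(rows[i], c) for c in centers)
            total += d2 if squared else math.sqrt(d2)
    return total / len(rows)

def choose_k(rows, k_max=6, v=5, min_gain=0.05, seed=0):
    """Smallest k after which adding a cluster improves the CV cost by < min_gain."""
    costs = [cv_cost(rows, k, v, seed) for k in range(1, k_max + 1)]
    for i in range(1, len(costs)):
        if costs[i - 1] - costs[i] < min_gain * costs[i - 1]:
            return i, costs  # costs[i - 1] belongs to k = i
    return k_max, costs
```

A call such as `k, costs = choose_k(zscore_columns(raw_rows))` would return the selected cluster count together with the cross-validated cost for each candidate k. Passing `squared=True` to `cv_cost` scores the held-out rows with squared Euclidean distance instead.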
The period of record examined for each station spans the 1877-78 through 2012-13 July-June rain years. Given the winter rainfall maximum/summer drought character of California coastal stations, the monthly selection comprises October-November, December, January, February, March, and April-May. The nature of the modes and their frequencies relative to El Nino, Neutral, and La Nina episodes are also described and analyzed.
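The monthly selection and the comparison against ENSO phase could be organized roughly as in the sketch below. This is a hypothetical layout, not the study's code. The function names, the `(calendar_year, month)` dictionary format, and the choice to sum the paired months (October-November, April-May) are all assumptions, since the abstract does not say how paired months are combined.

```python
from collections import Counter

def rain_year_features(monthly, start_year):
    """Six features for the July-June rain year beginning in `start_year`.

    `monthly` maps (calendar_year, month) -> precipitation total.
    Assumption: paired months are summed into a single value.
    """
    y0, y1 = start_year, start_year + 1
    return [
        monthly[(y0, 10)] + monthly[(y0, 11)],  # October-November
        monthly[(y0, 12)],                      # December
        monthly[(y1, 1)],                       # January
        monthly[(y1, 2)],                       # February
        monthly[(y1, 3)],                       # March
        monthly[(y1, 4)] + monthly[(y1, 5)],    # April-May
    ]

def mode_by_phase(labels, phases):
    """Count how often each cluster (mode) label occurs in each ENSO phase."""
    return Counter(zip(phases, labels))
```

For example, `mode_by_phase(cluster_labels, enso_phases)[("El Nino", 0)]` would give the number of El Nino rain years assigned to mode 0, where both input lists are ordered by rain year.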