To overcome this problem, rotation of the principal components has proven useful. The classical rotation criteria used in climatology are based on the general concept of "simple structure", which can provide spatially or temporally localized components [1]. In this work, we present several techniques which can be used for rotation of principal components. The proposed exploratory algorithms follow a common framework, which allows for finding underlying signals with suitable (interesting, desired) properties. The algorithms alternate between estimating the interesting structure of the extracted signals and using this structure to find new signal estimates.
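The alternation described above can be illustrated with a minimal sketch. The abstract does not give the update rules, so everything here is an assumption made for illustration: the data are first whitened by PCA, the "interesting structure" is taken to be slowness and is estimated with a simple moving average, and the function names (whiten, moving_average, rotate_one_component) are hypothetical.

```python
import numpy as np

def whiten(X, n_components):
    """Project X (channels x time) onto its leading principal
    components and scale each of them to unit variance."""
    X = X - X.mean(axis=1, keepdims=True)
    cov = X @ X.T / X.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    V = eigvecs[:, order] / np.sqrt(eigvals[order])
    return V.T @ X

def moving_average(s, width):
    """Assumed structure estimator: keep only the slow part of s."""
    kernel = np.ones(width) / width
    return np.convolve(s, kernel, mode="same")

def rotate_one_component(Y, width=25, n_iter=100, seed=0):
    """Alternate between estimating the structure of the current signal
    and re-estimating the projection that best reproduces it."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Y.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        s = w @ Y                             # current signal estimate
        s_struct = moving_average(s, width)   # estimate its structure
        w = Y @ s_struct                      # new projection from the structure
        w /= np.linalg.norm(w)                # keep the rotation norm-preserving
    return w, w @ Y
```

Because the assumed filter is linear, this loop amounts to a power iteration on the whitened data; a nonlinear structure estimator could be substituted in moving_average without changing the rest of the loop.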
This type of analysis can be particularly efficient in problems where some prior information (e.g., the general shape of the time curves of the relevant components or their frequency contents) exists. For example, in climate data analysis we might be interested in phenomena that are cyclic over a certain period or that exhibit slow changes. Exploiting such prior knowledge may then significantly help in finding a good representation of the data.
We use the presented tools for exploratory analysis of the large spatio-temporal dataset provided by the NCEP/NCAR reanalysis project [2]. The data is the reconstruction of the daily weather measurements around the globe for a period of 56 years.
The first proposed technique concentrates on slow climate variations. We show that optimization of a criterion that we term clarity helps find the sources exhibiting the most prominent variability in a specific timescale [3]. In the experiments, three major atmospheric variables (surface temperature, sea level pressure and precipitation) were analyzed. The components exhibiting the most prominent interannual variations are clearly related to the well-known El Niño-Southern Oscillation (ENSO) phenomenon. The time course of the most prominent component extracted from the three datasets is a good ENSO index, and the corresponding spatial patterns have many features traditionally associated with ENSO. Interestingly, the second component extracted from the dataset combining the three variables somewhat resembles the derivative of the ENSO index.
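As a rough illustration of a criterion of this kind, the sketch below measures how much of a signal's variance falls within a chosen frequency band. The exact definition of clarity is given in [3] and is not reproduced in this abstract; the ratio used here (variance of the band-pass filtered signal over variance of the original signal), the ideal FFT filter, and the function names bandpass_fft and clarity are assumptions made for illustration.

```python
import numpy as np

def bandpass_fft(s, dt, f_low, f_high):
    """Ideal band-pass filter: keep Fourier components whose
    frequency lies in [f_low, f_high] (in cycles per unit of dt)."""
    spectrum = np.fft.rfft(s)
    freqs = np.fft.rfftfreq(len(s), d=dt)
    mask = (freqs >= f_low) & (freqs <= f_high)
    return np.fft.irfft(spectrum * mask, n=len(s))

def clarity(s, dt, f_low, f_high):
    """Assumed form of the criterion: fraction of the signal variance
    that survives filtering to the timescale of interest."""
    s = s - s.mean()
    s_f = bandpass_fft(s, dt, f_low, f_high)
    return s_f.var() / s.var()
```

For daily data one might call clarity(s, dt=1/365, f_low=1/7, f_high=1/2) to score variability with periods between two and seven years; this band is an illustrative choice, not the one used in the experiments. With f_low > 0 the value lies between 0 and 1, and it approaches 1 for a signal whose variability is confined to the band.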
This technique can easily be tuned to find phenomena exhibiting prominent behavior in other timescales (e.g., trends, seasonal variations, etc.). We also introduce an extended technique which rotates the slow components based on their frequency contents. The sources found in the corresponding experiments give a meaningful representation of the slow climate variability as a combination of trends, interannual oscillations, the annual cycle and slowly changing seasonal variations. Again, components related to the ENSO phenomenon emerge very clearly among the found sources.
Another technique developed in this work seeks climate phenomena that correspond to prominent fluctuations of the intensities (variances) of weather variations in a specific timescale [4]. Fast-changing temperature components whose variances have prominent annual and decadal structures are extracted. The extracted annual components reflect higher temperature variability over the continents during winters. The components with more slowly changing variances might correspond to some interesting weather phenomena characterized by slowly changing temperature variability in specific regions.
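The idea of structured variance can be sketched as follows; this is not the algorithm of [4], only an illustration under stated assumptions. The local variance of a fast signal is approximated by a moving average of its squared values, and the prominence of a given timescale in that variance is scored as the fraction of its spectral power within a band. The function names variance_envelope and spectral_fraction, and the window width, are hypothetical.

```python
import numpy as np

def variance_envelope(s, width):
    """Assumed local-variance estimate: moving average of the
    squared, mean-removed signal."""
    s = s - s.mean()
    kernel = np.ones(width) / width
    return np.convolve(s ** 2, kernel, mode="same")

def spectral_fraction(v, dt, f_low, f_high):
    """Fraction of the power of v (after mean removal) that lies
    in the frequency band [f_low, f_high]."""
    v = v - v.mean()
    power = np.abs(np.fft.rfft(v)) ** 2
    freqs = np.fft.rfftfreq(len(v), d=dt)
    band = (freqs >= f_low) & (freqs <= f_high)
    return power[band].sum() / power.sum()
```

For a daily signal, spectral_fraction(variance_envelope(s, 31), 1/365, 0.9, 1.1) would score how strongly the variance of s follows an annual cycle; a score of this kind could in principle serve as the structure to be emphasized when rotating fast components.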
The considered techniques can easily be modified by emphasizing other types of interesting (temporal or spatial) structures or by taking into account available prior knowledge. An interesting approach motivated by the results of the frequency-based analysis is to seek groups of signals which share common and predictable dynamics [5]. Such a technique might allow for finding complex climate phenomena with the most predictable time course.
The experimental results obtained for the analyzed datasets are very promising. However, the meaning of the results needs to be further investigated, as some of the found components may correspond to significant climate phenomena while others may reflect artifacts produced during the data acquisition. A third possibility is that some components have been overfitted to the data, which could be tested by cross-validation.
[1] M. B. Richman. Rotation of principal components. Journal of Climatology, 6:293-335, 1986.
[2] E. Kalnay and coauthors. The NCEP/NCAR 40-year reanalysis project. Bulletin of the American Meteorological Society, 77:437-471, 1996.
[3] A. Ilin, H. Valpola, and E. Oja. Exploratory analysis of climate data using source separation methods. Neural Networks, 19(2):155-167, 2006.
[4] A. Ilin, H. Valpola, and E. Oja. Extraction of Components with Structured Variance. In Proc. of the IEEE World Congress on Computational Intelligence, WCCI 2006, pp. 10528-10535, Vancouver, BC, Canada, July 2006.
[5] A. Ilin. Independent Dynamics Subspace Analysis. In Proc. of the 14th European Symposium on Artificial Neural Networks, ESANN 2006, pp. 345-350, Bruges, Belgium, April 2006.
Supplementary URL: http://www.cis.hut.fi/alexilin/climate/