Tuesday, 22 January 2008: 2:00 PM
Regression-based methods for finding coupled patterns
219 (Ernest N. Morial Convention Center)
Poster PDF (904.6 kB)
There are a variety of multivariate statistical methods for analyzing the relations between two data sets. Two commonly used methods are canonical correlation analysis (CCA) and maximum covariance analysis (MCA), which find the projections of the data onto coupled patterns with maximum correlation and covariance, respectively. These projections are often used in linear prediction models. Redundancy analysis and principal predictor analysis construct projections that maximize, respectively, the explained variance and the sum of squared correlations of regression models. This paper shows that the above pattern methods are equivalent to different diagonalizations of the regression between the two data sets. The different diagonalizations are computed using the singular value decomposition of the regression matrix developed using data that are suitably transformed for each method. This common framework for the pattern methods permits easy comparison of their properties. Principal component regression is shown to be a special case of CCA-based regression. A commonly used linear prediction model constructed from MCA patterns does not give a least-squares estimate, since correlations among MCA predictors are neglected. A variation, denoted LSE-MCA, is suggested that uses the same patterns but minimizes squared error. Since the different pattern methods correspond to diagonalizations of the same regression matrix, they all produce the same regression model when a complete set of patterns is used. Different prediction models are obtained when an incomplete set of patterns is used, with each method optimizing different properties of the regression. Some key points are illustrated in two idealized examples, and the methods are applied to statistical downscaling of rainfall over the Northeast of Brazil.
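The equivalence described in the abstract can be checked numerically. The following is a minimal sketch on synthetic data, not the paper's own code: the variable names, the data-generating choices, and in particular the form assumed for the "commonly used" MCA prediction model (each predictand coefficient regressed on its paired predictor coefficient alone) are illustrative assumptions. It shows that CCA-based regression and a least-squares fit on MCA predictors both reproduce the ordinary least-squares regression matrix when all patterns are kept.

```python
import numpy as np

def inv_sqrt(C):
    """Inverse symmetric square root of a positive-definite matrix."""
    w, V = np.linalg.eigh(C)
    return V @ np.diag(w ** -0.5) @ V.T

# Synthetic, centered data (hypothetical sizes; p == q keeps all factors square).
rng = np.random.default_rng(0)
n, p, q = 500, 3, 3
mixing = np.array([[1.0, 0.5, 0.0], [0.0, 1.0, 0.5], [0.0, 0.0, 1.0]])
X = rng.standard_normal((n, p)) @ mixing          # correlated predictors
Y = X @ rng.standard_normal((p, q)) + 0.5 * rng.standard_normal((n, q))
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

Cxx, Cyy, Cxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n

# Ordinary least-squares regression matrix for Y ~ X @ B.
B_ols = np.linalg.solve(Cxx, Cxy)

# CCA: SVD of the cross-covariance of whitened X and whitened Y.
Wx, Wy = inv_sqrt(Cxx), inv_sqrt(Cyy)
U, s, Vt = np.linalg.svd(Wx @ Cxy @ Wy)           # s = canonical correlations

def cca_regression(k):
    """Regression matrix rebuilt from the leading k CCA modes."""
    return Wx @ U[:, :k] @ np.diag(s[:k]) @ Vt[:k] @ np.linalg.inv(Wy)

# MCA: SVD of the raw cross-covariance matrix.
Um, sm, Vmt = np.linalg.svd(Cxy)

def lse_mca(k):
    """Least-squares fit of Y on the leading k MCA predictor projections."""
    T = X @ Um[:, :k]
    coef = np.linalg.lstsq(T, Y, rcond=None)[0]
    return Um[:, :k] @ coef

# Assumed form of the diagonal MCA model: mode i of Y is predicted from
# mode i of X only, ignoring correlations among the MCA predictors.
d = np.einsum("ij,jk,ki->i", Um.T, Cxx, Um)       # variance of each X projection
B_mca_diag = Um @ np.diag(sm / d) @ Vmt

def mse(B):
    """Mean squared training error of the prediction X @ B."""
    return np.mean((Y - X @ B) ** 2)

print("CCA (all modes) == OLS:    ", np.allclose(cca_regression(q), B_ols))
print("LSE-MCA (all modes) == OLS:", np.allclose(lse_mca(q), B_ols))
print("MSE  OLS / diagonal MCA:   ", mse(B_ols), mse(B_mca_diag))
```

Calling `cca_regression(k)` or `lse_mca(k)` with `k` smaller than the number of modes gives the truncated models, which in general differ from one another because each method ranks the modes by a different criterion.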