The term canonical event is used because the motivation for defining these events is similar to the motivation behind the statistical method known as canonical correlation analysis (CCA). CCA measures the linear relationship between two multidimensional variables. It finds two bases, one for each variable, that are optimal with respect to correlations and, at the same time, it finds the corresponding correlations. In other words, it finds the two bases in which the correlation matrix between the variables is diagonal and the correlations on the diagonal are maximized. The dimensionality of these new bases is equal to or less than the smaller of the dimensionalities of the two variables. EPP3 does not use CCA explicitly, for two reasons. First, CCA is a linear technique. A non-linear form of CCA can be constructed by transforming values between data space and feature space, so that CCA is done in feature space and the results are converted back to data space. That approach could potentially be used with EPP3, but it would require additional work to represent precipitation as a continuous variable, with ranks assigned to zero values of precipitation forecasts and observations. Second, the optimal CCA basis elements depend on forecast variables, forecast sources, time of year and forecast location, so there is no universal, optimal set of CCA basis functions. It would nevertheless be interesting to apply CCA to the data sets used to estimate EPP3 parameters, to see what insights could be gained to improve how EPP3 processes the available data. That has not been done.
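The diagonalizing property of CCA described above can be illustrated with a short numerical sketch. This is a generic textbook construction (whitening each variable with a Cholesky factor, then taking a singular value decomposition of the whitened cross-covariance) and is not part of EPP3. The function name `cca` and the small regularization term `reg` are assumptions introduced here for illustration.

```python
import numpy as np

def cca(X, Y, reg=1e-10):
    """Illustrative linear CCA (not EPP3 code).

    X is (n_samples, p), Y is (n_samples, q). Returns the canonical
    correlations s (length min(p, q)) and basis matrices A (p x k), B (q x k).
    """
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]
    # Sample covariance blocks; reg is a hypothetical safeguard for stability.
    Sxx = X.T @ X / (n - 1) + reg * np.eye(X.shape[1])
    Syy = Y.T @ Y / (n - 1) + reg * np.eye(Y.shape[1])
    Sxy = X.T @ Y / (n - 1)
    # Cholesky factors: Sxx = Lx Lx^T, Syy = Ly Ly^T.
    Lx = np.linalg.cholesky(Sxx)
    Ly = np.linalg.cholesky(Syy)
    # Whitened cross-covariance K = Lx^{-1} Sxy Ly^{-T}.
    K = np.linalg.solve(Lx, Sxy)
    K = np.linalg.solve(Ly, K.T).T
    # Singular values of K are the canonical correlations.
    U, s, Vt = np.linalg.svd(K, full_matrices=False)
    # Map the singular vectors back to the original coordinates.
    A = np.linalg.solve(Lx.T, U)
    B = np.linalg.solve(Ly.T, Vt.T)
    return s, A, B
```

Projecting the centered variables onto `A` and `B` gives new variables whose cross-covariance matrix is diagonal, with the canonical correlations on the diagonal, and the number of canonical pairs is the smaller of the two dimensionalities.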
Two types of canonical events are used in EPP3: base events and modulation events. Base events form a concatenated, non-overlapping set of time intervals that spans the maximum length of the forecast period. Each modulation event includes two or more base events, and modulation events can overlap each other. Separate sets of canonical events are defined for precipitation and temperature. The time step for precipitation events is 6 hours. The time step for temperature events is daily (maximum and minimum).
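The structure of base and modulation events can be sketched as intervals over forecast time steps. The sketch below is only an illustration of the definitions above. The event durations, the grouping of base events into modulation events, and all function names are hypothetical and are not taken from EPP3.

```python
from typing import List, Sequence, Tuple

# An event is a half-open interval [start, end) in units of the time step
# (for precipitation, one step = 6 hours).
Event = Tuple[int, int]

def make_base_events(durations: Sequence[int]) -> List[Event]:
    """Concatenate durations into contiguous, non-overlapping base events."""
    events, start = [], 0
    for d in durations:
        if d <= 0:
            raise ValueError("each duration must be a positive number of steps")
        events.append((start, start + d))
        start += d
    return events

def make_modulation_events(base: Sequence[Event],
                           groups: Sequence[Tuple[int, int]]) -> List[Event]:
    """Build one modulation event per group (i, j), covering base events i..j."""
    events = []
    for i, j in groups:
        if not (0 <= i < j < len(base)):
            raise ValueError("a modulation event needs two or more base events")
        events.append((base[i][0], base[j][1]))
    return events

def event_totals(series: Sequence[float], events: Sequence[Event]) -> List[float]:
    """Accumulate a time series (e.g. 6-hourly precipitation) over each event."""
    return [sum(series[a:b]) for a, b in events]

# Hypothetical 3-day horizon (12 six-hour steps): short events early, longer later.
base_events = make_base_events([1, 1, 1, 1, 2, 2, 4])
# Hypothetical modulation events; the second overlaps the first and the third.
modulation_events = make_modulation_events(base_events, [(0, 3), (2, 5), (4, 6)])
```

Using half-open intervals makes it easy to check that the base events tile the forecast horizon without gaps, while modulation events are free to overlap.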
The canonical events used in EPP3 apply only to the temporal definition of events over a forecast time horizon. Future work is needed to add a spatial dimension to canonical event definitions, which may improve how uncertainty in spatially distributed events, especially precipitation, is represented. At present, EPP3 relies on the Schaake Shuffle to maintain spatial consistency among ensemble members, but the limitations of this approach are untested. Improvements to the way EPP3 handles uncertainty in space will probably require a gridded, as opposed to basin-based, approach to ensemble generation. This will require gridded precipitation and temperature analyses (e.g. the proposed analysis of record) and an approach to relate such gridded analyses to the basin-based MAP/MAT time series used for hydrologic model calibration. Such a spatial approach to canonical event definition would not necessarily replace the Schaake Shuffle completely, but it could provide a better constraint on how it is used.
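For readers unfamiliar with the Schaake Shuffle, the core rank-reordering idea can be sketched as follows. This is a minimal generic version, assuming one column per location (or variable or time step) and the same number of historical records as ensemble members. It is not the EPP3 implementation. The function name and the stable tie-breaking rule (relevant for zero precipitation) are assumptions made here.

```python
import numpy as np

def schaake_shuffle(ensemble, historical):
    """Reorder ensemble members so their ranks follow the historical ranks.

    ensemble, historical: arrays of shape (n_members, n_points). Each column
    of the result holds the same values as the ensemble column, rearranged so
    that the member with the k-th smallest historical value receives the k-th
    smallest ensemble value.
    """
    ensemble = np.asarray(ensemble, dtype=float)
    historical = np.asarray(historical, dtype=float)
    if ensemble.shape != historical.shape:
        raise ValueError("ensemble and historical must have the same shape")
    # Sort each column of the ensemble independently.
    sorted_ens = np.sort(ensemble, axis=0)
    # Rank of each historical value within its column (0 = smallest).
    # A stable sort is an assumed tie-breaking rule for repeated values.
    order = np.argsort(historical, axis=0, kind="stable")
    ranks = np.argsort(order, axis=0, kind="stable")
    # Assign sorted ensemble values according to the historical ranks.
    return np.take_along_axis(sorted_ens, ranks, axis=0)
```

The marginal distribution at each point is unchanged, because each column is only permuted, while the rank correlation between points is inherited from the historical records.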
An example application to precipitation forecasts for the North Fork of the American River in California will be presented. This will include an illustration of the important role of modulation events in improving some of the verification statistics. It will be shown that the proposed application of canonical events offers a seamless approach to using weather and climate forecasts from multiple forecast sources, for different forecast periods and locations, as input to produce reliable hydrometeorological ensemble forcing for hydrological ensemble forecast systems.