Self-Organizing Maps: Map-Typing by Neural Network to Increase Reliability of High-Impact Weather Forecasting

Tuesday, 4 February 2014: 8:30 AM
Room C204 (The Georgia World Congress Center)
Ryan A. Lagerquist, Environment Canada, Edmonton, AB, Canada; and A. Ling



Ryan Lagerquist, Alister Ling


Many types of high-impact weather (HIW) are short in duration and local in extent, so physical atmospheric models do not always predict them adequately; essentially, such HIW processes occur on scales smaller than the model resolution.  One approach to these inadequacies is to use machine learning, or artificial intelligence, to post-process output from the physical models, in the hope of yielding better predictions for HIW.  Since the physical models generally predict synoptic-scale pressure and height fields better than mesoscale HIW phenomena, we post-process these pressure/height fields and use them as predictors for HIW.  Our project at Environment Canada uses Self-Organizing Maps (SOMs), a type of neural network, to predict various HIW phenomena based on characteristics of the large-scale (65-km resolution) fields of mean sea-level pressure (MSLP) and geopotential height (GPH) at various levels.


The HIW phenomena that we predict are visibility, visibility in the absence of precipitation, freezing drizzle, and accumulated precipitation.  The first two phenomena (visibilities) are important mainly to aviation forecasters, especially at TAF (terminal aerodrome forecast) sites.  “Visibility in the absence of precipitation” is used to predict low-visibility cases caused by other processes, generally fog and low-level stratus, which are rarely reported in weather observations (whereas precipitation is reported much more often).  Once SOMs have been trained offline to predict these HIW phenomena, they are placed in online mode, in which they post-process the results of each 6-hourly run of the GEM Regional (Environment Canada's operational model) to yield predictions for sites over a broad area.  This area includes four domains, each of approximately 1 million km²: southern Alberta, the Mackenzie River Valley, southern Baffin Island, and southern Ontario.


In offline training of the SOMs, we use the batch algorithm of Vesanto et al. [1].  In initial experiments we used the unmodified version of this algorithm, which compares absolute values of MSLP and GPH using the Euclidean distance as a similarity measure.  However, many of the map types had high internal variance (i.e., the individual cases assigned to a single pattern were very different from one another), which led to poor predictions.  Based on these initial failures and the fact that gradients of MSLP and GPH are more fundamental to weather phenomena than their absolute values, we decided to train SOMs with the 2-D gradients of these fields.  We developed our own similarity metric to compare these gradient fields (henceforth, the “gradient distance”).  For the magnitudes of the gradient vectors, we use a log normalization, so that gradients at one level are not weighted disproportionately to those at another.  Lastly, since the number of neurons in the SOM layer is set higher than the desired number of patterns (based on suggestions by Ultsch and Mörchen [2]), agglomerative hierarchical clustering is used to decrease the number of patterns after SOM training.
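
The abstract does not define the gradient distance, so the following Python sketch is only a stand-in that illustrates the idea: it takes the 2-D gradient of a field, log-scales the vector magnitudes, and compares two fields by the Euclidean distance between the scaled vectors.  The function names, the use of log1p, and the choice of Euclidean distance are assumptions, not the authors' metric.

```python
import numpy as np


def log_gradient_field(field, spacing_km=65.0):
    """Return the 2-D gradient of `field` with log-scaled magnitudes.

    The result has shape (2, ny, nx): x-component first, then y-component.
    """
    d_dy, d_dx = np.gradient(field, spacing_km)  # axis 0 is y, axis 1 is x
    magnitude = np.hypot(d_dx, d_dy)
    # Assumed normalization: replace each magnitude m with log(1 + m),
    # keeping the direction of the vector unchanged.
    safe_magnitude = np.where(magnitude > 0, magnitude, 1.0)
    scale = np.log1p(magnitude) / safe_magnitude
    return np.stack([d_dx * scale, d_dy * scale])


def gradient_distance(field_a, field_b, spacing_km=65.0):
    """Hypothetical stand-in for the gradient distance described above."""
    diff = (log_gradient_field(field_a, spacing_km)
            - log_gradient_field(field_b, spacing_km))
    return float(np.sqrt(np.sum(diff ** 2)))
```

Under these assumptions, two fields that differ only by a constant offset (for example, the same MSLP pattern shifted by 15 hPa) have zero distance, which is the property that motivates comparing gradients rather than absolute values.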


For each of the four domains, a set of 12 SOMs is trained, one for each month.  This ensures that seasonality is implicitly accounted for: for example, if an MSLP/GPH pattern conducive to severe thunderstorms in July appears in January, severe thunderstorms will not be predicted.  At this point the historical MSLP/GPH fields for each month and each domain have been categorized into a smaller number of typical patterns, or “map types”.
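
As a minimal sketch of this bookkeeping, the historical cases can be grouped by domain and calendar month before training, giving one training set per SOM.  The tuple layout and the function name are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def group_cases_by_domain_and_month(cases):
    """Group (domain, valid_time, field) tuples into per-SOM training sets.

    Returns a dict keyed by (domain, month); with four domains and twelve
    months there are at most 48 keys, one per SOM.
    """
    training_sets = defaultdict(list)
    for domain, valid_time, field in cases:
        training_sets[(domain, valid_time.month)].append(field)
    return dict(training_sets)
```

Keying on the month is what keeps a July pattern from being matched against January cases.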


The next step is to correlate each map type with the intended HIW phenomenon.  Each map type comprises a number of historical cases, and the time of occurrence of each case is tracked.  Freezing drizzle and visibility are instantaneous measurements rather than accumulations over time (for freezing drizzle, only occurrence or non-occurrence is recorded).  To correlate Map Type k with one of these phenomena, it therefore suffices to find the HIW field concurrent with each MSLP/GPH field assigned to the map type.  For accumulated precipitation, the 6-hour total centered on each MSLP/GPH field is used, i.e., from 3 hours before to 3 hours after.
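
The two matching rules can be sketched as follows.  This assumes observations are stored in a dict keyed by timestamp and that each hourly precipitation value is the accumulation over the hour ending at its timestamp; both are illustrative assumptions, as are the function names.

```python
from datetime import datetime, timedelta


def concurrent_value(observations, valid_time):
    """Instantaneous phenomena: take the observation at the field's valid time."""
    return observations.get(valid_time)


def centered_six_hour_total(hourly_precip, valid_time):
    """Accumulated precipitation from 3 h before to 3 h after `valid_time`.

    Sums the six hourly amounts whose hours end at valid_time - 2 h through
    valid_time + 3 h. Returns None if any hour in the window is missing.
    """
    total = 0.0
    for offset in range(-2, 4):
        amount = hourly_precip.get(valid_time + timedelta(hours=offset))
        if amount is None:
            return None
        total += amount
    return total
```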


We then use the HIW fields associated with each map type to calculate statistics at each weather station.  For accumulated precipitation and visibility, the statistics are: minimum, maximum, average, frequency of severe events (defined by Environment Canada warning criteria: 20 mm / 6 h for precipitation, 0.5 mi for visibility), and frequency of sub-severe events (10 mm / 6 h and 1.0 mi, respectively).  For freezing drizzle, since only occurrence or non-occurrence is measured, the statistic calculated is the frequency of occurrence at each station.
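
A minimal sketch of the per-station statistics for one map type is given below.  It assumes the thresholds are inclusive and that the sub-severe frequency also counts severe cases; the abstract does not state either, and the function name is hypothetical.

```python
import operator
from statistics import mean


def station_statistics(values, severe, sub_severe, low_is_bad=False):
    """Summarize the historical values at one station for one map type.

    For precipitation a case is severe when value >= threshold; for
    visibility (low_is_bad=True) it is severe when value <= threshold.
    """
    if not values:
        raise ValueError("no historical cases for this map type and station")
    meets = operator.le if low_is_bad else operator.ge
    count = len(values)
    return {
        "minimum": min(values),
        "maximum": max(values),
        "average": mean(values),
        "freq_severe": sum(meets(v, severe) for v in values) / count,
        "freq_sub_severe": sum(meets(v, sub_severe) for v in values) / count,
    }
```

For example, `station_statistics(totals_mm, severe=20.0, sub_severe=10.0)` would cover 6-hour precipitation, and `station_statistics(visibility_mi, severe=0.5, sub_severe=1.0, low_is_bad=True)` would cover visibility.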


With the SOMs trained and correlated, they are now run in online mode.  Forecast MSLP/GPH fields from the GEM Regional, at 3-hour intervals up to 54 hours, are used to find the 3 nearest map types.  Then, based on these 3 map types, HIW statistics are plotted on an internal website.  This is done for each domain, each analysis hour, and each HIW phenomenon.  Furthermore, we have developed an objective warning scheme based on the raw HIW predictions, which determines when a yellow or red flag should be raised for forecasters (for reasonable probability of a sub-severe or severe event, respectively).  These flags are shown in convenient pop-up tables for each domain on the homepage, so that forecasters can properly focus their attention on the “trouble areas”.
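
The online step can be sketched as a nearest-neighbour lookup followed by a threshold test.  The abstract does not describe how the 3 map types are combined or what probability counts as "reasonable", so the plain average and the 0.2 threshold below are placeholders, and all names are hypothetical.

```python
import heapq
from statistics import mean


def nearest_map_types(distances, k=3):
    """Return the k map-type labels closest to the forecast field.

    `distances` maps each map-type label to its distance from the forecast.
    """
    return heapq.nsmallest(k, distances, key=distances.get)


def flag_for_station(stats_by_type, nearest, threshold=0.2):
    """Return "red", "yellow", or None for one station.

    Assumed rule: average the severe and sub-severe frequencies over the
    nearest map types and compare each average with `threshold`.
    """
    severe = mean(stats_by_type[t]["freq_severe"] for t in nearest)
    sub_severe = mean(stats_by_type[t]["freq_sub_severe"] for t in nearest)
    if severe >= threshold:
        return "red"
    if sub_severe >= threshold:
        return "yellow"
    return None
```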


This SOM tool has high “glance value” for forecasters and has been used successfully to aid in forecast decisions.  The methodology will be presented along with one or two case studies of successful use at Environment Canada.


[1] Vesanto, Juha, et al. "Self-organizing map in Matlab: the SOM toolbox." Proceedings of the Matlab DSP conference. Vol. 99. 1999.


[2] Ultsch, Alfred, and Fabian Mörchen. "ESOM-Maps: tools for clustering, visualization, and classification with Emergent SOM." (2005).