2.5 A Machine Learning Approach to Forecast Geopotential Heights using Historical Analogs as a Training Set

Monday, 23 January 2017: 5:00 PM
310 (Washington State Convention Center)
Christopher Lee Kuhn, Raytheon, Richardson, TX

A machine learning / analog forecasting tool developed with the R programming language and the Shiny web application framework is presented. The prototype calculates features of current geopotential heights and correlates them with the same features calculated from the NCEP_Reanalysis 2 dataset (provided by the NOAA/OAR/ESRL PSD). Isopleth curvature is the primary feature used in the correlation. The current observation is correlated with the geopotential heights in each six-hour time period from 1979 through 2015. The tool displays a list of the top correlations and lets the user view the geopotential heights from the dates of the top hits, as well as the geopotential heights for the days following each top hit.
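The analog search described above can be illustrated with a minimal sketch. It is written in Python for illustration rather than in the tool's R, and it rests on assumptions that the abstract does not confirm: that each six-hour period is reduced to a flat feature vector, and that Pearson correlation is the similarity measure. The names `pearson`, `top_analogs`, and `archive`, and all timestamps and values, are hypothetical.

```python
import math


def pearson(a, b):
    """Pearson correlation between two equal-length feature vectors."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("vectors must have equal length of at least 2")
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A constant vector has no defined correlation; treat it as no match.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def top_analogs(current, archive, k=3):
    """Rank archived periods by correlation with the current features.

    `archive` maps a timestamp label to that period's feature vector
    (an assumed layout, not the tool's actual data structure).
    Returns the k best (timestamp, correlation) pairs, best first.
    """
    scored = [(pearson(current, feats), stamp) for stamp, feats in archive.items()]
    scored.sort(reverse=True)
    return [(stamp, r) for r, stamp in scored[:k]]


# Made-up feature vectors standing in for curvature features.
archive = {
    "1983-01-05T00": [2.0, 4.0, 6.0, 8.0],
    "1991-02-11T06": [4.0, 3.0, 2.0, 1.0],
    "2004-12-30T18": [1.0, 2.0, 2.0, 4.0],
}
hits = top_analogs([1.0, 2.0, 3.0, 4.0], archive, k=2)
```

Here `hits` lists the two archived periods whose features vary most like the current ones. A real implementation would compute the features over the chosen geospatial bounding box before correlating.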

The top correlations compose a training set for use by the XGBoost machine learning algorithm. The XGBoost algorithm is then applied to current observations to produce a geopotential height forecast. The tool also creates an analog geopotential forecast by fusing the heights from the days following each top correlation. The machine learning forecast is compared to the analog forecast and validated against observations.
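The abstract does not say how the heights following each top correlation are fused. One plausible reading, shown here purely as an assumption, is a weighted average of the analog fields, with weights such as the correlation scores. The function name `fuse_analogs`, the grid layout, and the height values are hypothetical.

```python
def fuse_analogs(fields, weights=None):
    """Fuse analog height fields into one forecast by weighted averaging.

    `fields` is a list of 2-D grids (lists of rows) of identical shape,
    each holding the heights that followed one analog date.
    `weights` defaults to equal weighting; correlation scores are one
    possible choice (an assumption, not the tool's documented method).
    """
    if not fields:
        raise ValueError("at least one analog field is required")
    if weights is None:
        weights = [1.0] * len(fields)
    if len(weights) != len(fields):
        raise ValueError("one weight per field is required")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    rows, cols = len(fields[0]), len(fields[0][0])
    for field in fields:
        if len(field) != rows or any(len(row) != cols for row in field):
            raise ValueError("all fields must share the same grid shape")
    return [
        [
            sum(w * field[i][j] for w, field in zip(weights, fields)) / total
            for j in range(cols)
        ]
        for i in range(rows)
    ]


# Two made-up 2x2 height grids (metres) from the days after two analogs.
after_first = [[5500.0, 5520.0], [5480.0, 5500.0]]
after_second = [[5540.0, 5560.0], [5520.0, 5540.0]]
equal_mix = fuse_analogs([after_first, after_second])
weighted_mix = fuse_analogs([after_first, after_second], weights=[3.0, 1.0])
```

Weighting by correlation lets closer analogs dominate the fused field, while equal weights treat every top hit alike. Which choice verifies better against observations is an empirical question.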

Metrics quantifying the approach’s accuracy, including how accuracy relates to the number of years available in the dataset, will be presented. The selection of a suitable geospatial bounding box to perform the correlation is also discussed.

The tool can increase confidence in forecasts derived from other methods. The tool can also serve as a training aid to rapidly acquaint someone with a region’s historical weather patterns.
