89th American Meteorological Society Annual Meeting

Monday, 12 January 2009
Combining automated and human predictions: the results of a 1000-day real-time trial
Hall 5 (Phoenix Convention Center)
Harvey Stern, Bureau of Meteorology, Melbourne, Vic., Australia
Poster PDF (589.1 kB)
It is a well-established mathematical result that two or more inaccurate but independent (or partially independent) predictions of the same future events may be combined to yield predictions that are, on average, more accurate than either taken separately. Automated and human forecasts might be expected to "bring to the table" different knowledge sets, and this suggested the development of a weather forecasting system that mechanically integrated (combined) human and computer-generated predictions. Stern (2007a, 2007b) reported on the performance of such forecasts generated during a real-time trial between 20 August 2005 and 19 August 2006 and found that the combined forecasts did indeed perform better than the separate human and computer forecasts. Because a real-time trial evaluates forecasts that are generated prior to the event, its results possess greater validity than if the new methodology had been evaluated in a hindcasting mode (even with the application of sophisticated cross-validation techniques).
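The variance-reduction principle behind combining can be illustrated with a minimal sketch using synthetic data (this is a textbook inverse-variance weighting example, not the Bureau's actual integration scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
truth = rng.normal(20.0, 5.0, size=100_000)  # synthetic "observed" temperatures

# Two unbiased but noisy, independent predictions of the same truth
human = truth + rng.normal(0.0, 2.0, size=truth.size)
model = truth + rng.normal(0.0, 3.0, size=truth.size)

# Inverse-variance weights minimise the MSE of a linear combination
w_human = (1 / 2.0**2) / (1 / 2.0**2 + 1 / 3.0**2)
combined = w_human * human + (1 - w_human) * model

def mse(forecast):
    return float(np.mean((forecast - truth) ** 2))

# The combination beats both inputs: theoretical MSEs are 4.0, 9.0 and ~2.77
print(mse(human), mse(model), mse(combined))
```

With partially dependent errors, as between human and model forecasts, the gain is smaller but the same logic applies so long as the errors are not perfectly correlated.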

Since 20 August 2006, very long range forecasts have also been generated by combining computer predictions with climatology (climatology was used given the absence of very long lead time human forecasts). Verification over the one-year period to 19 August 2007 (Stern, 2008) revealed that Day-8 forecasts so generated explained 11.2% of the observed variance, Day-9 forecasts 7.2%, and Day-10 forecasts 3.4%. For these very long range day-to-day forecasts, however, the variance explained was mainly in the temperature components. Specifically, the percentages of observed variance explained by Quantitative Precipitation Forecasts (QPFs), Minimum Temperature Forecasts (MINFs) and Maximum Temperature Forecasts (MAXFs) were:

Day-8: QPFs 4.2%, MINFs 17.9%, MAXFs 17.5%
Day-9: QPFs 3.1%, MINFs 10.4%, MAXFs 10.0%
Day-10: QPFs 0.9%, MINFs 7.7%, MAXFs 4.6%
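The "variance explained" scores above are conventionally the squared Pearson correlation between forecasts and observations, expressed as a percentage; a minimal sketch (the Bureau's exact verification code is not given here, so this convention is an assumption, and the data are hypothetical):

```python
import numpy as np

def variance_explained(forecasts, observations):
    """Percentage of observed variance explained: squared Pearson correlation x 100."""
    r = np.corrcoef(forecasts, observations)[0, 1]
    return 100.0 * r ** 2

# Hypothetical week of maximum temperature forecasts vs. observations (deg C)
obs = np.array([18.0, 22.5, 25.0, 19.5, 30.0, 27.5, 21.0])
fcst = np.array([20.0, 21.0, 24.0, 22.0, 28.0, 26.0, 23.0])
print(round(variance_explained(fcst, obs), 1))
```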

The mechanically combined forecasts continue to perform strongly, as evidenced by verification statistics derived from the 1,000 Melbourne Day-1 to Day-7 forecast sets generated by combining human and computer predictions between 20 August 2005 and 15 May 2008. For example, the accuracy of the 14,000 Melbourne Day-1 to Day-7 minimum and maximum temperature predictions so generated has been increased through the mechanical integration process, the Mean Square Error (MSE) of the mechanically integrated forecasts being 0.81 deg C lower than that of the corresponding human (official) product. Similarly, the accuracy of the 7,000 Melbourne Day-1 to Day-7 rainfall forecasts has been increased, mechanically integrated forecasts of whether or not it was going to rain being correct 6.6% more often than the corresponding human (official) product. The accuracy of the 7,000 Melbourne Day-1 to Day-7 thunderstorm forecasts has likewise been increased, the Critical Success Index (CSI) of the mechanically integrated thunderstorm forecasts being 3.6% higher than that of the corresponding human (official) product. The accuracy of the 7,000 Melbourne Day-1 to Day-7 fog forecasts has also been increased, albeit only slightly, the CSI of the mechanically integrated fog forecasts being 0.9% higher than that of the corresponding human (official) product.
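The Critical Success Index used for the thunderstorm and fog scores is the standard ratio of hits to the sum of hits, misses and false alarms; a minimal sketch with hypothetical yes/no data:

```python
def critical_success_index(forecast, observed):
    """CSI = hits / (hits + misses + false alarms) for yes/no event forecasts."""
    hits = sum(1 for f, o in zip(forecast, observed) if f and o)
    misses = sum(1 for f, o in zip(forecast, observed) if not f and o)
    false_alarms = sum(1 for f, o in zip(forecast, observed) if f and not o)
    return hits / (hits + misses + false_alarms)

# Hypothetical week of thunderstorm forecasts vs. what occurred:
# 3 hits, 1 miss, 1 false alarm -> CSI = 3 / 5 = 0.6
forecast = [True, True, False, False, True, False, True]
observed = [True, False, False, True, True, False, True]
print(critical_success_index(forecast, observed))
```

Note that correct negatives do not enter the CSI, which makes it a natural score for rare events such as thunderstorms and fog, where forecasting "no" is nearly always correct.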

The verification of the 1,000 Melbourne Day-1 to Day-7 forecast sets refers to an overall evaluation of forecast performance with all lead times taken together. However, even when the evaluation was undertaken with lead times taken separately, a lift in accuracy occurred in most instances.

References:

Stern H (2007a) Increasing forecast accuracy by mechanically combining human and automated predictions using a knowledge based system, 23rd Conference on Interactive Information and Processing Systems, San Antonio, Texas, USA 14-18 Jan., 2007.

Stern H (2007b) Improving forecasts with mechanically combined predictions, Bulletin of the American Meteorological Society (BAMS), June 2007, 88:850-851.

Stern H (2008) Does society benefit from very long range day-to-day weather forecasts? Symposium on Linkages among Societal Benefits, Prediction Systems and Process Studies for 1-14-day Weather Forecasts, New Orleans, Louisiana, USA 23 Jan., 2008.

Supplementary URL: http://www.weather-climate.com