Statistical scenario forecasting

Veeramachaneni, Sriharsha

Many applications require the ability to judge the uncertainty of time-series forecasts. Uncertainty is often specified as point-wise error bars around a mean or median forecast. Because the values of a time series are temporally dependent, point-wise error bars obscure some information. Ideally, we would be able to query the posterior probability of the entire time series given the predictor variables or, at a minimum, to draw samples from this distribution. We use a Bayesian dictionary learning algorithm to statistically generate scenarios (alternative forecasts) given a small set of seed forecasts from a physics-based model. Although our statistical approach generates scenarios only about as well as the physics-based method, it has three advantages: it can be better calibrated, it offers a computationally cheap way to generate a large number of scenarios, and it can be applied to variables for which physical models are absent. The main disadvantages are that, being a statistical approach, it is limited by the scenarios observed in the training data, and that the generated scenarios may violate known physical constraints.

One approach taken in weather forecasting is to generate several "scenarios", or realizations of the time series, such that each realization satisfies the dependencies that are known to exist. If the scenario forecasting method draws from the posterior distribution, we can use the scenarios to answer complex queries about the forecast (e.g., compute the probability that there will be at least 1 inch of rain and that the dew point will be below 60 degrees), which is impossible with simple error bars or predictive intervals on each variable separately.
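As a minimal sketch of how such a query could be answered from an ensemble, the probability of a joint event can be estimated as the fraction of scenarios in which the event occurs. Everything named here is an assumption made for illustration: the scenario layout (`rain_in`, `dew_point_f`), the reading of the query as "total rain over the horizon of at least 1 inch and dew point below 60 degrees at every step", and the synthetic scenarios, which merely stand in for draws from a real scenario generator.

```python
import random


def event_probability(scenarios, event):
    """Monte Carlo estimate: fraction of scenarios for which `event` holds."""
    if not scenarios:
        raise ValueError("need at least one scenario")
    return sum(1 for s in scenarios if event(s)) / len(scenarios)


def rain_and_low_dew_point(scenario):
    """Hypothetical joint event: total rain >= 1 inch AND dew point < 60 at every step."""
    return sum(scenario["rain_in"]) >= 1.0 and max(scenario["dew_point_f"]) < 60.0


# Synthetic stand-in for scenarios drawn from a forecasting method (assumption).
rng = random.Random(0)
scenarios = [
    {
        "rain_in": [rng.uniform(0.0, 0.6) for _ in range(4)],
        "dew_point_f": [rng.gauss(55.0, 3.0) for _ in range(4)],
    }
    for _ in range(1000)
]

p = event_probability(scenarios, rain_and_low_dew_point)
print(f"Estimated probability of the joint event: {p:.3f}")
```

Because the event is evaluated on whole scenarios, the dependence between rain and dew point (and across time steps) is carried into the estimate, which is what per-variable error bars cannot provide.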

The approach we explore here is called dictionary learning: a statistical method for learning "interesting" directions (or a basis) that compactly summarize high-dimensional data. Dictionary learning with sparse over-complete representations has been applied extensively in image and video processing. We learn a dictionary jointly for the target time series to be predicted and any available predictor variables, using a recently proposed Bayesian dictionary learning algorithm. At prediction time, we draw samples from the conditional distribution of the target time series given the observed predictors and the dictionary. Each of these samples is a scenario, and from this ensemble of scenarios the probability of any event of interest can be estimated. We evaluate our statistical scenario generation method by comparing its scenarios with the SREF scenarios for the temperature in Houston, using the Minimum Spanning Tree Distance Rank Histogram. The preliminary experimental results presented show that the method holds promise for scenario generation.
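The rank statistic behind a minimum spanning tree rank histogram can be sketched as follows. This is an illustrative sketch under one common convention, not necessarily the exact procedure used in this work: for each forecast case, the MST length of the ensemble is ranked among the lengths obtained by substituting the observation for each member in turn, and the ranks are accumulated over cases. The function names, the Euclidean distance, and the synthetic data are assumptions.

```python
import math
import random


def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree (Prim's algorithm)."""
    n = len(points)
    if n < 2:
        return 0.0
    in_tree = [False] * n
    best = [math.inf] * n  # cheapest known edge from the tree to each point
    best[0] = 0.0
    total = 0.0
    for _ in range(n):
        # Add the point outside the tree that is closest to it.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        total += best[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
    return total


def mst_rank(ensemble, observation):
    """Rank (1..N+1) of the ensemble's MST length among the lengths obtained
    by swapping the observation in for each of the N members in turn."""
    base = mst_length(ensemble)
    swapped = [
        mst_length(ensemble[:i] + [observation] + ensemble[i + 1:])
        for i in range(len(ensemble))
    ]
    return 1 + sum(1 for length in swapped if length < base)


# Synthetic check (assumption): observation and members share one distribution.
rng = random.Random(0)
n_members, horizon, n_cases = 5, 3, 200
counts = [0] * (n_members + 1)  # histogram over ranks 1..N+1
for _ in range(n_cases):
    ensemble = [[rng.gauss(0, 1) for _ in range(horizon)] for _ in range(n_members)]
    observation = [rng.gauss(0, 1) for _ in range(horizon)]
    counts[mst_rank(ensemble, observation) - 1] += 1
print(counts)
```

Under this convention, an observation that lies far outside the ensemble lengthens every substituted tree, so the ensemble's own tree receives a low rank; a histogram piled up at low ranks would therefore point to scenarios that are too tightly clustered, while a roughly flat histogram is what a well-calibrated ensemble should produce.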
