201 Machine Learning-based Dynamic Forecasting of Weather-induced Electric Outages

Monday, 29 January 2024
Hall E (The Baltimore Convention Center)
Tianqiao Zhao, Br, Upton, NY; and M. Yue, M. P. Jensen, S. Endo, and J. E. González-Cruz


As the trend in climate change continues, extreme weather events are expected to occur with increasing frequency and severity, posing a significant threat to the electric power infrastructure. Regardless of utilities' efforts in hardening the grid, damage to utility assets such as overhead cables and distributed energy resources (DERs) that are particularly vulnerable to such events is unavoidable. Having a highly granular outage forecasting tool with a long lead time will be a great advantage for resource allocation and service restoration.

Many studies [1] have addressed damage forecasting for power grids under various hazardous weather events by identifying correlations between grid damage occurrences and relevant weather conditions. However, these works rely on specific prediction model designs and do not make comprehensive use of archived weather data for training. In addition, existing research, while valuable, often lacks granular outage information such as precise location and timing. This data gap hinders strategic crew deployment and efficient response. Enhancing outage prediction accuracy can greatly facilitate storm preparation and reduce restoration time and costs. Achieving finer prediction granularity requires managing the uncertainties tied to weather-induced outages. One study [2] introduced a regression-based outage model, but its handling of temporal dependency suffered because the weather inputs were averaged. Advances in machine learning (ML) make it a powerful tool for predicting outages using different types of data from disparate sources. For example, [3] used long short-term memory (LSTM) networks to handle temporal aspects of outage prediction, avoiding complex data preprocessing.

This study develops and implements an ML-based multi-model framework as an operational tool, built on a dynamic, granular, multi-day electric outage forecasting model that uses numerical weather forecasts and high-resolution outage information. An innovative two-layered dynamic neural network (DNN)-based forecasting model and a sliding window approach are developed to make better use of the available data. To further improve outage forecasting performance, we focus on two major issues that have significant negative impacts:

  1. The selection of weather variables as input to the ML model: Weather forecast data offer hundreds of candidate variables, many of them redundant or highly correlated, which can easily lead to overfitting. Understanding each variable's impact on model predictions and selecting appropriate input variables are crucial for developing a capable predictive model. To achieve this, we employ a model-agnostic approach that computes permutation-based variable importance, assisted by domain expertise in meteorological science. This approach quantifies the shift in prediction error as each variable is permuted, which disrupts the link between the feature and the model prediction. Variables are retained only if permuting their values notably increases prediction error.
  2. The uneven distribution of weather events across categories and severities: A significant hurdle in constructing a comprehensive outage forecasting model arises from the unequal occurrence of diverse weather events. Real-world weather data often exhibit imbalanced class ratios, potentially introducing bias into the outage prediction model. This study tackles this challenge by employing variational autoencoder (VAE)-based techniques to rebalance the dataset through data augmentation. To capture the temporal progression of weather events, we further integrate the VAE model with an LSTM network. This hybrid approach aims to enhance dynamic forecasting performance using sequential weather data encompassing various weather scenarios. The trained VAE-LSTM model is then used to generate synthetic multivariate weather sequences that augment the original data.

First, we conducted a study of the influence of input variables. The variable selection method described above was used to obtain the permutation error for each feature. A higher model output error after permuting a variable indicates that the variable is more important and relevant to the output of the prediction model. Our results confirm that different variables (or features) differ in importance and relevance to the output of the prediction model.
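The permutation-based screening described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy data, the stand-in predictor `toy_model`, the number of repeats, and the `THRESHOLD` cut-off are hypothetical and are not taken from the study, which applies the idea to its own forecasting model and weather variables.

```python
import random

def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean increase in MSE per column when that column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between variable j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(y, [predict(row) for row in X_perm]) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Toy data (hypothetical): column 0 stands for wind gust, column 1 for
# precipitation, column 2 for a variable unrelated to outages.
data_rng = random.Random(1)
X = [[data_rng.uniform(0, 30), data_rng.uniform(0, 10), data_rng.uniform(0, 1)]
     for _ in range(200)]
y = [2.0 * row[0] + 0.5 * row[1] for row in X]

def toy_model(row):
    """Stand-in predictor; any callable mapping a row to a prediction works."""
    return 2.0 * row[0] + 0.5 * row[1]

scores = permutation_importance(toy_model, X, y)
THRESHOLD = 1.0  # assumed cut-off for a "notable" error increase
selected = [j for j, s in enumerate(scores) if s > THRESHOLD]
print(scores, selected)
```

Because the routine only calls `predict`, it stays model-agnostic: the same loop could wrap a neural network or any other predictor. In practice the cut-off would be chosen together with meteorological domain knowledge rather than fixed in advance.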

We then compare the performance of the model using the original input variables with that of the model using the selected variables. The results of an example case, shown in the included figure, demonstrate that the reduced variable set enhances performance.

Second, we conduct a study to verify the performance of the VAE-LSTM model for data balancing. Analyzing the dataset, we found that weather data associated with more than 20 outages in one of the 24 subareas of the utility accounted for only a small portion (3%) of the entire dataset, resulting in imbalanced data. Therefore, we categorize the original data into two classes (i.e., >20 outages and <=20 outages during the entire forecast horizon) and apply the augmentation algorithm to balance the original data. We find that the proposed data augmentation method can balance the data classes by ranking and generating samples while avoiding generated samples that lie too close to the decision boundary.
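The two-class split and the rebalancing step can be sketched as follows. This is only an illustration under stated assumptions: the toy sequences are invented, and the `jitter` function is a deliberately simple placeholder that perturbs existing minority sequences. It stands in for the trained VAE-LSTM generator and does not reproduce it, nor the ranking of generated samples relative to the decision boundary.

```python
import random

OUTAGE_THRESHOLD = 20  # from the text: >20 outages vs. <=20 outages

def label(total_outages):
    """Class 1 for high-impact events (>20 outages), class 0 otherwise."""
    return 1 if total_outages > OUTAGE_THRESHOLD else 0

def jitter(sequence, rng, scale=0.05):
    """Placeholder generator (assumption): small multiplicative noise.
    A trained VAE-LSTM decoder would take this role in the study."""
    return [[v * (1 + rng.gauss(0, scale)) for v in step] for step in sequence]

def balance(samples, generate, seed=0):
    """samples: list of (weather_sequence, total_outages) pairs.
    Returns (sequence, class_label) pairs with both classes equally sized."""
    rng = random.Random(seed)
    labelled = [(seq, label(n)) for seq, n in samples]
    minority = [seq for seq, c in labelled if c == 1]
    deficit = sum(1 for _, c in labelled if c == 0) - len(minority)
    if not minority or deficit <= 0:
        return labelled  # nothing to generate from, or already balanced
    synthetic = [(generate(rng.choice(minority), rng), 1) for _ in range(deficit)]
    return labelled + synthetic

# Toy dataset (hypothetical): 100 sequences of 4 time steps x 2 variables,
# of which 3 (3%) exceed the outage threshold.
data_rng = random.Random(42)
samples = []
for i in range(100):
    seq = [[data_rng.uniform(0, 30), data_rng.uniform(0, 10)] for _ in range(4)]
    outages = 35 if i < 3 else data_rng.randint(0, 20)
    samples.append((seq, outages))

balanced = balance(samples, jitter)
counts = {c: sum(1 for _, lab in balanced if lab == c) for c in (0, 1)}
print(counts)
```

Passing the generator in as a callable keeps the balancing logic separate from the generative model, so a learned sampler could replace `jitter` without touching `balance`.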

We consider three performance metrics: the mean squared error (MSE) and the p-value, both averaged over 30 tests, and Win/Tie/Loss counts. Win/Tie/Loss indicates whether one model performs better ('Win' or 'Loss') or the two perform similarly ('Tie'), where 'Win' means the model with data augmentation is better and 'Loss' means the model without it is better. The p-value indicates whether the improvement is statistically significant at the 5% level. The MSE improves from 0.57% to 0.23% with data augmentation. The results show that the data augmentation method significantly boosts prediction performance.
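One way such a Win/Tie/Loss tally could be computed is sketched below. The abstract does not state which significance test was used or how a 'Tie' was decided, so the paired sign-flip permutation test, the tie rule (p-value at or above 0.05), and the simulated squared errors are all assumptions chosen for illustration; they are not the study's data or procedure.

```python
import random

def sign_flip_p_value(diffs, rng, n_perm=1000):
    """Two-sided paired sign-flip permutation test on the summed difference."""
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(n_perm):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def compare(se_base, se_aug, rng, alpha=0.05):
    """Return ('Win' | 'Tie' | 'Loss', p) from paired squared errors.
    'Win' means the augmented model has significantly lower error."""
    diffs = [b - a for b, a in zip(se_base, se_aug)]
    p = sign_flip_p_value(diffs, rng)
    if p >= alpha:
        return "Tie", p
    return ("Win" if sum(diffs) > 0 else "Loss"), p

rng = random.Random(0)
N_TESTS, N_SAMPLES = 30, 100
tally = {"Win": 0, "Tie": 0, "Loss": 0}
mse_base, mse_aug, p_values = [], [], []
for _ in range(N_TESTS):
    # Simulated per-sample squared errors (hypothetical error scales).
    se_base = [rng.gauss(0, 1.0) ** 2 for _ in range(N_SAMPLES)]
    se_aug = [rng.gauss(0, 0.6) ** 2 for _ in range(N_SAMPLES)]
    outcome, p = compare(se_base, se_aug, rng)
    tally[outcome] += 1
    mse_base.append(sum(se_base) / N_SAMPLES)
    mse_aug.append(sum(se_aug) / N_SAMPLES)
    p_values.append(p)

mean_mse_base = sum(mse_base) / N_TESTS
mean_mse_aug = sum(mse_aug) / N_TESTS
mean_p = sum(p_values) / N_TESTS
print(tally, mean_mse_base, mean_mse_aug, mean_p)
```

A permutation test is used here only because it needs no distributional assumptions and no external libraries; a paired t-test or a Wilcoxon signed-rank test would fit the same structure.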

References

[1] Nateghi, Roshanak, Seth D. Guikema, and Steven M. Quiring. "Comparison and validation of statistical methods for predicting power outage durations in the event of hurricanes." Risk Analysis: An International Journal 31, no. 12 (2011).

[2] Yue, Meng, Tami Toto, Michael P. Jensen, Scott E. Giangrande, and Robert Lofaro. "A Bayesian approach-based outage prediction in electric utility systems using radar measurement data." IEEE Transactions on Smart Grid 9, no. 6 (2017).

[3] Alpay, Berk A., David Wanik, Peter Watson, Diego Cerrai, Guannan Liang, and Emmanouil Anagnostou. "Dynamic modeling of power outages caused by thunderstorms." Forecasting 2, no. 2 (2020).
