11.4 A Classification Model for Daily Lightning Intensities in Alaska

Thursday, 4 May 2023: 2:15 PM
Scandinavian Ballroom Salon 4 (Royal Sonesta Minneapolis Downtown)
Joshua Hostler, Univ. of Alaska Fairbanks, Fairbanks, AK; and U. S. Bhatt, P. Bieniek, E. Fischer, T. J. Ballinger, C. Borries-Strigle, R. Thoman, R. Lader, M. Burgard, C. F. Waigl, J. Chriest, H. Strader, E. Stevens, and Z. Parish

In 2015 Alaska experienced one of the most extreme early fire seasons on record with 99% of acres burned attributable to lightning. Subsequently, the 2022 fire season became the earliest on record to reach 1 million acres burned – again largely driven by lightning activity. Fire management would benefit from skillful outlooks on lightning likelihood to use for planning. We use Self-Organizing Maps (SOMs) as an input layer for a random forest classifier to predict daily lightning likelihoods.

A Self-Organizing Map is an unsupervised clustering algorithm that, when applied to gridded atmospheric variables, identifies common daily weather patterns (the attached figure shows a trained SOM for 500 hPa geopotential height anomalies). We train SOMs on ECMWF Reanalysis v5 (ERA5) daily anomalies for June and July of 1959 to 2022 for the following variables: 500 hPa geopotential height, sea level pressure, 2-meter temperature, 850 hPa temperature, convective available potential energy, and total column water vapor.
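As a rough illustration of how a SOM groups daily fields into a small set of patterns, the sketch below trains a tiny 1x2 map on synthetic four-point "anomaly fields". It is a dependency-free toy and not the configuration used in this study: the function names (`train_som`, `best_matching_unit`), the linear decay schedules, the grid size, and the synthetic data are all assumptions made for the example.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Return the grid coordinates of the node whose weights are closest to x."""
    return min(nodes, key=lambda rc: sum((w - v) ** 2 for w, v in zip(nodes[rc], x)))


def train_som(samples, rows, cols, epochs=20, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM; returns {(row, col): weight vector}."""
    rng = random.Random(seed)
    # Initialise each node with a copy of a randomly chosen sample.
    nodes = {(r, c): list(rng.choice(samples))
             for r in range(rows) for c in range(cols)}
    n_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(range(len(samples)))
        rng.shuffle(order)
        for i in order:
            x = samples[i]
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                # assumed: linear learning-rate decay
            sigma = sigma0 * (1.0 - frac) + 0.01   # assumed: shrinking neighbourhood
            bmu = best_matching_unit(nodes, x)
            for (r, c), w in nodes.items():
                # Nodes near the best-matching unit on the grid move more.
                d2 = (r - bmu[0]) ** 2 + (c - bmu[1]) ** 2
                h = math.exp(-d2 / (2.0 * sigma ** 2))
                for k in range(len(w)):
                    w[k] += lr * h * (x[k] - w[k])
            step += 1
    return nodes


# Synthetic data (hypothetical): two idealised patterns plus small noise,
# each "field" flattened to a vector of four grid points.
noise = random.Random(42)
ridge = [1.0, 1.0, -1.0, -1.0]
trough = [-1.0, -1.0, 1.0, 1.0]
days = [[v + noise.gauss(0.0, 0.1) for v in (ridge if i % 2 == 0 else trough)]
        for i in range(40)]

som = train_som(days, rows=1, cols=2)
bmu_ridge = best_matching_unit(som, ridge)
bmu_trough = best_matching_unit(som, trough)
print("ridge-like days map to node", bmu_ridge)
print("trough-like days map to node", bmu_trough)
```

Each day is then summarized by the node it maps to, rather than by its full gridded field.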

In this study, we classify days as low, medium, or high by tercile of daily lightning counts computed from the Alaska Lightning Detection Network historical lightning dataset. Each daily record is associated with a weight in the 2D SOM network. This reduces the dimensionality of the raw gridded data from about 200,000 values (34,001 pixels for each of 6 variables) to just 12. These results are then used to train a random forest classifier, which uses an 80-20 train-test split and 5-fold cross-validation for hyperparameter tuning. Table 1 is the confusion matrix for the test dataset.
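A minimal sketch of tercile labeling, assuming the cut points are the 1/3 and 2/3 quantiles of the daily counts. The counts below are made up, `tercile_labels` is a hypothetical helper, and the handling of ties at the cut points is an assumption rather than something specified here.

```python
import statistics


def tercile_labels(counts):
    """Label each daily count low/medium/high using the 1/3 and 2/3 quantiles."""
    lower, upper = statistics.quantiles(counts, n=3)
    labels = []
    for c in counts:
        if c <= lower:          # assumed: ties at a cut point go to the lower class
            labels.append("low")
        elif c <= upper:
            labels.append("medium")
        else:
            labels.append("high")
    return labels


# Hypothetical daily lightning counts (not from the study's dataset).
daily_counts = [12, 0, 400, 5, 90, 2, 1200, 40, 150]
labels = tercile_labels(daily_counts)
for count, label in zip(daily_counts, labels):
    print(f"{count:>5} strikes -> {label}")
```

With real data, many days sharing the same count (for example, zero strikes) can make the three classes unequal in size.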

Table 1. Confusion matrix for the test dataset (rows: actual class; columns: predicted class).

                  Predicted
Actual      Low    Middle    High
Low          73        28      24
Middle       34        37      47
High         12        19      81

Our model shows skill in classifying low- and high-tercile lightning days, with mean AUROC and F1 scores of 0.70 and 0.53, respectively (climatology yields scores of 0.5 and 0.33, respectively). Classification of middle-tercile days only marginally outperforms the baseline scores. True upper-tercile days are correctly predicted at the highest rate (recall of 0.723). Table 2 summarizes the classification metrics for each class.

Table 2. Classification metrics for each class.

Class     Precision   Recall   F1-Score   AUROC
Low           0.613    0.584      0.598   0.767
Medium        0.440    0.314      0.366   0.559
High          0.533    0.723      0.614   0.785
Mean          0.529    0.540      0.526   0.704
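The precision, recall, and F1 values in Table 2 follow from the confusion matrix in Table 1, as the short check below illustrates. It assumes that rows are actual classes and columns are predicted classes, and that the "Mean" row is an unweighted average over the three classes; `per_class_metrics` is a hypothetical helper name. AUROC is not included because it depends on predicted class probabilities, which cannot be recovered from the matrix.

```python
# Confusion matrix from Table 1: rows are actual classes, columns are predicted.
classes = ["Low", "Medium", "High"]
confusion = [
    [73, 28, 24],
    [34, 37, 47],
    [12, 19, 81],
]


def per_class_metrics(cm):
    """Return a (precision, recall, f1) tuple for each class of a square matrix."""
    metrics = []
    for k in range(len(cm)):
        true_pos = cm[k][k]
        predicted_k = sum(row[k] for row in cm)  # column sum: all days predicted as k
        actual_k = sum(cm[k])                    # row sum: all days actually in k
        precision = true_pos / predicted_k
        recall = true_pos / actual_k
        f1 = 2 * precision * recall / (precision + recall)
        metrics.append((precision, recall, f1))
    return metrics


metrics = per_class_metrics(confusion)
# Unweighted (macro) mean of each metric across the three classes.
macro = [sum(column) / len(metrics) for column in zip(*metrics)]

for name, (p, r, f) in zip(classes, metrics):
    print(f"{name:<7} precision={p:.3f} recall={r:.3f} f1={f:.3f}")
print(f"{'Mean':<7} precision={macro[0]:.3f} recall={macro[1]:.3f} f1={macro[2]:.3f}")
```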

Test scores improved over validation scores, suggesting the model will perform similarly on new data. Future work will include an in-depth model evaluation, including an examination of feature importance and identification of sources of model error. We also plan to apply this methodology to distinguish lightning days from days without lightning. Finally, we plan to employ this model with seasonal dynamical forecasts to construct a multi-model seasonal outlook.
