Our current approach is a model with two distinct machine learning steps. First, we use a Self-Organizing Map (SOM) to extract features of climate variability from gridded synoptic-scale atmospheric variables. Second, we use a Random Forest classifier to partition this variability space and predict lightning versus non-lightning days. In other words, we predict lightning events solely in terms of climate variability, without weather information. Historical lightning observations for 1986-2022 are obtained from the Alaska Lightning Detection Network (ALDN), which is operated by the Alaska Fire Service. Atmospheric variables are obtained from the ECMWF Reanalysis v5 (ERA5) for 1959-2022. The variables explored in this study are 500 hPa height, convective available potential energy (CAPE), convective precipitation (CP), sea level pressure, 2-meter temperature, 850 hPa temperature, total column ice fraction, total column water vapor, total column cloud ice water, total column cloud liquid water, and the 850-500 hPa potential temperature difference.
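To make the first step concrete, the following is a minimal pure-Python sketch of SOM feature extraction. It is illustrative only and is not the implementation used in this work: the function names (`train_som`, `best_matching_unit`), the 1x2 grid, the learning-rate and neighbourhood schedules, and the two-variable synthetic "anomaly" vectors are all assumptions chosen to keep the example small.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the grid coordinates of the node whose weights are closest to x."""
    return min(
        weights,
        key=lambda node: sum((w - v) ** 2 for w, v in zip(weights[node], x)),
    )


def train_som(samples, rows, cols, epochs=50, lr0=0.5, sigma0=1.0, seed=0):
    """Train a small rectangular SOM and return {(row, col): weight vector}."""
    rng = random.Random(seed)
    nodes = [(r, c) for r in range(rows) for c in range(cols)]
    # Initialise node weights from evenly spaced samples (an arbitrary choice).
    weights = {
        node: list(samples[(k * len(samples)) // len(nodes)])
        for k, node in enumerate(nodes)
    }
    n_steps = epochs * len(samples)
    step = 0
    for _ in range(epochs):
        order = list(samples)
        rng.shuffle(order)
        for x in order:
            frac = step / n_steps
            lr = lr0 * (1.0 - frac)                  # decaying learning rate
            sigma = max(sigma0 * (1.0 - frac), 0.1)  # shrinking neighbourhood
            br, bc = best_matching_unit(weights, x)
            for (r, c), w in weights.items():
                # Gaussian neighbourhood centred on the best-matching unit.
                h = math.exp(-((r - br) ** 2 + (c - bc) ** 2) / (2.0 * sigma ** 2))
                for i, xi in enumerate(x):
                    w[i] += lr * h * (xi - w[i])
            step += 1
    return weights


# Synthetic two-variable "anomaly" vectors forming two regimes (not real data).
regime_a = [(0.0 + 0.1 * i, 0.0 - 0.1 * i) for i in range(10)]
regime_b = [(10.0 + 0.1 * i, 10.0 - 0.1 * i) for i in range(10)]
som = train_som(regime_a + regime_b, rows=1, cols=2)

# The SOM node assigned to each day is the feature a classifier would receive.
features_a = [best_matching_unit(som, x) for x in regime_a]
features_b = [best_matching_unit(som, x) for x in regime_b]
```

In the approach described above, node assignments of this kind would then be passed to the Random Forest classifier; that second step is omitted from the sketch.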
While this model performs modestly as a diagnostic tool, achieving an F1-score of 0.797 when predicting lightning days over central Alaska, its skill decreases when predicting daily lightning intensity (high, medium, and low terciles of daily stroke counts), with the middle tercile predicted no better than a random guess. We discuss this effect in the following paragraph. Furthermore, to achieve S2S predictions the model requires accurate multi-model ensemble (MME) forecasts as inputs. Variables such as CAPE, CP, and ice fraction are relatively complicated, resulting in less certain MME predictions. For this reason, we want to work towards a model with fewer and simpler predictors.
An important feature of the current approach is the use of anomalies, which removes seasonal information from the predictors. However, the number of daylight hours (the amount of time in a day during which the sun is above the horizon) is an important intraseasonal control on the magnitude and variability of lightning events in Alaska, as demonstrated in Figure 1. The energy available to drive convection increases as daylight hours increase, which increases not only the mean daily lightning count but also its variance. Consequently, a given variability in large-scale dynamics does not produce the same variability in daily lightning on days of different length. Controlling for this effect is necessary in order to extend this model to predictions of daily lightning intensity.
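As a rough illustration of how strongly day length varies at high latitude, the sketch below estimates daylight hours from latitude and day of year using a standard geometric approximation (a cosine fit for solar declination and the sunrise hour-angle equation). It ignores atmospheric refraction and the size of the solar disc, so it slightly underestimates observed day length. The function name `daylight_hours` is hypothetical, and 64.8°N, roughly the latitude of Fairbanks, is used only as an assumed example location for central Alaska.

```python
import math


def daylight_hours(lat_deg, day_of_year):
    """Approximate hours with the sun above the horizon (geometric, no refraction)."""
    # Approximate solar declination in degrees for the given day of year.
    decl_deg = -23.44 * math.cos(2.0 * math.pi * (day_of_year + 10) / 365.0)
    # Sunrise hour-angle equation: cos(omega0) = -tan(latitude) * tan(declination).
    x = -math.tan(math.radians(lat_deg)) * math.tan(math.radians(decl_deg))
    # Clamp to [-1, 1] so polar day and polar night give 24 h and 0 h.
    x = max(-1.0, min(1.0, x))
    omega0 = math.acos(x)
    return 24.0 * omega0 / math.pi


# Assumed example: compare mid-May (day 135) with the solstice (day 172) at 64.8 N.
may_hours = daylight_hours(64.8, 135)
solstice_hours = daylight_hours(64.8, 172)
```

Under this approximation, day length at the assumed latitude changes by several hours within a single lightning season, which is the variation the anomaly-based predictors do not carry.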
We propose to control for the effect of daylight hours on daily lightning counts by predicting the magnitude of lightning events relative to days of similar length rather than relative to the entire season. Recent work has shown seasonal predictability of summer fire weather in Alaska associated with the phasing of atmospheric and oceanic teleconnections, including the Pacific-North American (PNA) pattern and Arctic Oscillation (AO) (Justino et al., 2022) and the East Pacific/North Pacific (EP/NP) pattern (Zhao et al., 2022). We will examine the S2S predictability of lightning in Alaska with the PNA, AO, EP/NP, and Madden-Julian Oscillation (MJO) teleconnection indices as additional predictor variables in our statistical model, to assess how large-scale Pacific ocean-atmosphere conditions relate to lightning strikes as daylight varies through the summer.
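One possible way to express lightning magnitude relative to days of similar length is to compute terciles separately within bins of daylight hours. The sketch below is a hedged illustration of that idea, not the method adopted in this work: the function name `relative_terciles`, the one-hour bin width, and the stroke counts are all assumptions, and the data are synthetic.

```python
from collections import defaultdict
from statistics import quantiles


def relative_terciles(daylight, counts, bin_width=1.0):
    """Label each day 0/1/2 (low/medium/high) relative to days in its daylight bin."""
    groups = defaultdict(list)
    for i, hours in enumerate(daylight):
        groups[int(hours // bin_width)].append(i)
    labels = [None] * len(counts)
    for idx in groups.values():
        values = [counts[i] for i in idx]
        if len(values) < 3:
            continue  # too few days to define terciles; leave unlabeled
        lo, hi = quantiles(values, n=3)  # two cut points separating three classes
        for i in idx:
            labels[i] = 0 if counts[i] <= lo else (1 if counts[i] <= hi else 2)
    return labels


# Synthetic example (not real data): shorter days followed by longer days.
daylight = [18.2, 18.5, 18.9, 18.1, 18.3, 18.7, 21.1, 21.4, 21.8, 21.2, 21.5, 21.9]
counts = [1, 2, 3, 4, 5, 6, 10, 20, 30, 40, 50, 60]
labels = relative_terciles(daylight, counts)
```

In this synthetic example, a day with 6 strokes is labeled high among the shorter days while a day with 10 strokes is labeled low among the longer days, whereas season-wide terciles would rank them the other way around.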

