The 2017 Statistical Model for United States Annual Lightning Fatalities

Roeder, William P.

An update is presented for the curve-fitting model that predicts the expected number and 95% confidence interval of lightning fatalities in the U.S. for any year and the number expected by any day within that year. The update adds the most recent annual U.S. lightning fatalities for 2014-2016. In addition, some older data that is no longer representative of the current U.S. lightning fatality rate is excluded. Finally, a new graphic display is presented that allows easy interpretation of the predictions versus observed lightning fatalities during the year.

In addition to the update to the curve-fitting model, two apparently discontinuous changes in the trend of U.S. annual lightning fatalities were identified that occurred in 1994 and 2008. Accounting for these changes and excluding prior data was the main motivation for updating the model. The new model presented in this paper is valid for the current regime of U.S. annual lightning fatalities that began in 2008.
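As a rough illustration of restricting a trend fit to the current regime, the sketch below fits a log-linear (exponential) trend only to years from 2008 onward and extrapolates to 2017. The exponential form, the helper names, and the synthetic series are all assumptions made for illustration. The abstract does not state the functional form of the fitted curve, and no observed fatality counts are used here.

```python
import math

REGIME_START = 2008  # start of the current regime, as stated in the abstract


def fit_log_linear(offsets, counts):
    """Ordinary least squares of ln(count) on the year offset.

    Returns (intercept, slope) of ln(count) = intercept + slope * offset.
    """
    n = len(offsets)
    logs = [math.log(c) for c in counts]
    mean_x = sum(offsets) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in offsets)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(offsets, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, year):
    """Evaluate the fitted exponential trend for a calendar year."""
    return math.exp(intercept + slope * (year - REGIME_START))


def synthetic_count(year):
    """Made-up smooth series with a break at REGIME_START (NOT observed data)."""
    if year < REGIME_START:
        return 60.0 * math.exp(-0.01 * (year - 1990))
    return 30.0 * math.exp(-0.04 * (year - REGIME_START))


# Keep only the years that belong to the current regime before fitting.
all_years = list(range(1990, 2017))
regime_years = [y for y in all_years if y >= REGIME_START]
regime_offsets = [y - REGIME_START for y in regime_years]
regime_counts = [synthetic_count(y) for y in regime_years]

intercept, slope = fit_log_linear(regime_offsets, regime_counts)
prediction_2017 = predict(intercept, slope, 2017)
```

Because the synthetic series is exactly exponential within the regime, the fit recovers its decay rate. With real counts, the residual scatter would also be needed to form a confidence interval.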

The logistic regression model for how U.S. lightning fatalities accumulate throughout the year is also updated to include 2006-2016 (dates of lightning deaths before 2006 are not available). This model appears to have stabilized, with only small incremental changes resulting from the new data. The new model shows that the median of the U.S. lightning fatality season is 13 July.
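The way a logistic accumulation curve turns an annual prediction into a number expected by a given day can be sketched as follows. This is a minimal illustration, not the fitted model. The midpoint is set to the 13 July median reported above and the annual total of 25.5 is the 2017 prediction quoted in this abstract, but the steepness value and the function names are hypothetical.

```python
import math
from datetime import date

# Day of year for 13 July in a non-leap year (the reported season median).
MEDIAN_DAY = date(2017, 7, 13).timetuple().tm_yday

# Hypothetical steepness per day; the abstract does not give the fitted value.
STEEPNESS = 0.04


def cumulative_fraction(day_of_year, midpoint=MEDIAN_DAY, steepness=STEEPNESS):
    """Fraction of the annual fatalities expected to have occurred by a day."""
    return 1.0 / (1.0 + math.exp(-steepness * (day_of_year - midpoint)))


def expected_by_day(annual_total, day_of_year):
    """Scale the annual prediction by the cumulative fraction for that day."""
    return annual_total * cumulative_fraction(day_of_year)


# Half of the predicted annual total is expected by the median day.
half_season = expected_by_day(25.5, MEDIAN_DAY)
```

A plain logistic curve never reaches exactly 0 on 1 January or exactly 1 on 31 December, so a working model would need to handle the ends of the year in some way that this sketch does not.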

The model fits a best-fit curve to the declining trend of U.S. annual lightning fatalities. This approach has a significant advantage over the typical approach of using a running 30-year mean for sources of weather fatalities, because a running mean is not representative of the current year. A 30-year running mean has an inherent 15-year lag, and in practice the lag is usually 1-2 years longer to allow for data collection, analysis, and dissemination. The 30-year running mean is a good approach for stationary data (no trend over time), but under a declining trend, such as that of U.S. lightning fatalities, it overestimates the current rate.

For example, the most recent 30-year (1986-2015) running mean is 48 U.S. lightning deaths per year. This contrasts with the new statistical model's prediction of 25.5 lightning fatalities for 2017, with a 95% confidence interval of 18.3 to 34.7 deaths. The 30-year running mean lies outside this 95% confidence interval, so the difference is statistically significant at the 95% confidence level. The 10-year (2006-2015) running mean of 31 fatalities is more consistent with the curve-fitting model, being inside its 95% confidence interval, but the curve-fitting approach is still preferred because a 10-year running mean is very sensitive to any extreme events during that period. A 10-year running mean also has an inherent 5-year lag, which is 6-7 years in practice. Finally, the running-mean approach does not provide error bars for assessing whether differences between prediction and observation are statistically significant, although the error bars could easily be calculated from the same data used in the running means.

The curve-fitting approach overcomes all of these shortfalls of the running-mean approach: it is valid in the year of interest while being consistent with past years, it is relatively insensitive to extreme values, and it provides error bars.
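The lag of a running mean under a declining trend, and the interval comparison made above, can be checked with a short sketch. The linear series below is synthetic and chosen only to make the arithmetic transparent; it is not the observed fatality record. The function names are illustrative. The values 48, 31, 18.3, and 34.7 are the figures quoted in this abstract.

```python
def running_mean(values, window):
    """Mean of the most recent `window` entries of a series."""
    if len(values) < window:
        raise ValueError("series is shorter than the window")
    return sum(values[-window:]) / window


def within_interval(value, low, high):
    """True if the value lies inside the closed interval [low, high]."""
    return low <= value <= high


# Synthetic, steadily declining series for 1986-2015 (hypothetical numbers).
years = list(range(1986, 2016))
series = [80.0 - 1.5 * (y - 1986) for y in years]

# For a linear trend, the 30-year mean equals the trend value at the middle
# of the window (about 15 years back), so it overstates the latest value.
mean_30 = running_mean(series, 30)
latest = series[-1]
overestimate = mean_30 - latest

# Interval check using the figures quoted in the abstract.
CI_LOW, CI_HIGH = 18.3, 34.7
mean_30_inside = within_interval(48, CI_LOW, CI_HIGH)
mean_10_inside = within_interval(31, CI_LOW, CI_HIGH)
```

In this synthetic case the overestimate equals the yearly decline multiplied by 14.5, the distance in years from the center of the 30-year window to its final year.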

J11.6 The 2017 Statistical Model for United States Annual Lightning Fatalities