13B.2 The Development of a Single-Radar Tornado Prediction Algorithm Using Machine Learning

Thursday, 31 August 2023: 10:45 AM
Great Lakes A (Hyatt Regency Minneapolis)
Thea Sandmael, NSSL, Norman, OK; and R. B. Steeves, Z. Fruits, I. Schick, M. Ake, Z. A. Cooper, J. Widanski, Q. Thomas, and R. Galang

With development of the CIWRO/NSSL single-radar tornado probability algorithm (TORP) for detecting tornadoes now complete, a natural next step is to tackle tornado prediction using similar machine learning techniques. While labeled data for the detection problem are readily available through tornado reports in the NCEI Storm Events Database, no labeled dataset of pre-tornadic storms is currently publicly available. To remedy this, a team of CIWRO/NSSL undergraduate students at the OU School of Meteorology has undertaken a multi-year effort to identify pre-tornadic storm locations up to an hour before reported tornadogenesis, from the start of the dual-polarization era to the present. Over 40,000 pre-tornadic data points have been created and associated with single-radar data.
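The time-window criterion described above (storm locations up to an hour before reported tornadogenesis) can be illustrated with a minimal sketch. It covers only the lead-time filter, not the manual identification of storm locations carried out by the team. The function name `pretornadic_scans`, its parameters, and the example times are hypothetical and do not come from TORP.

```python
from datetime import datetime, timedelta
from typing import List, Sequence, Tuple


def pretornadic_scans(
    scan_times: Sequence[datetime],
    genesis_time: datetime,
    max_lead: timedelta = timedelta(hours=1),
) -> List[Tuple[datetime, float]]:
    """Return (scan_time, lead_minutes) for scans inside the pre-tornadic window.

    Hypothetical helper: a scan qualifies if it precedes the reported
    tornadogenesis time by more than zero and at most `max_lead`.
    """
    selected = []
    for scan in scan_times:
        lead = genesis_time - scan
        # Keep only scans strictly before genesis and within the window.
        if timedelta(0) < lead <= max_lead:
            selected.append((scan, lead.total_seconds() / 60.0))
    return selected


# Illustrative (made-up) times: one reported tornadogenesis and four radar scans.
genesis = datetime(2020, 5, 1, 18, 15)
scans = [
    datetime(2020, 5, 1, 17, 0),   # more than an hour before genesis
    datetime(2020, 5, 1, 17, 30),
    datetime(2020, 5, 1, 18, 10),
    datetime(2020, 5, 1, 18, 20),  # after genesis
]
window = pretornadic_scans(scans, genesis)
```

Each retained scan carries its lead time in minutes, which would allow performance to be examined as a function of lead time if desired.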

After the training and testing of a machine learning model using this dataset, pre-tornadic probabilities are now included with the TORP base algorithm to create an object-based future tornado potential trend. This will give TORP two probability outputs: one for detection and one for prediction. In addition to the new prediction model, which is trained on the same instantaneous radar predictors as TORP’s detection model, we are developing additional models based on predictors derived from supplemental data, to be used internally in TORP. These models will include predictors based on 1) the trend of the radar predictors, such as maximum azimuthal shear or maximum convergent shear, which will be used when an object track history is available, and 2) environmental variables, which will be used when environmental model data are supplied. The TORP base algorithm will automatically select and switch between the various models according to which of the three types of data (instantaneous radar predictors, radar predictor trends, and environmental variables) are available. Whether to include these additional models will depend on whether the added data yield a significant performance gain, weighed against computational and temporal costs. In real-time applications, it is critical to minimize time lag so that the algorithm remains useful to the operational community, even if the additional data could provide a more informed tornado prediction.
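The availability-based model selection described above can be sketched as follows. This is a minimal illustration under stated assumptions, not TORP's implementation: the `StormObject` class, the model labels, the `min_history` threshold, and the idea of combining trend and environmental predictors in a single model are all hypothetical.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Sequence


@dataclass
class StormObject:
    """Hypothetical container for the data available for one tracked object."""
    radar: Mapping[str, float]                                      # instantaneous radar predictors
    track_history: Optional[Sequence[Mapping[str, float]]] = None   # earlier scans, if tracked
    environment: Optional[Mapping[str, float]] = None                # environmental model data


def select_model(obj: StormObject, min_history: int = 2) -> str:
    """Pick a model label from the data that are actually available.

    Assumption: trend predictors need at least `min_history` earlier scans,
    and the radar-only model is the fallback that is always usable.
    """
    has_trend = obj.track_history is not None and len(obj.track_history) >= min_history
    has_env = bool(obj.environment)
    if has_trend and has_env:
        return "radar+trend+environment"
    if has_trend:
        return "radar+trend"
    if has_env:
        return "radar+environment"
    return "radar"


# Illustrative objects with made-up predictor names and values.
new_object = StormObject(radar={"max_azshear": 0.012})
tracked_object = StormObject(
    radar={"max_azshear": 0.015},
    track_history=[{"max_azshear": 0.009}, {"max_azshear": 0.012}],
)
full_object = StormObject(
    radar={"max_azshear": 0.015},
    track_history=[{"max_azshear": 0.009}, {"max_azshear": 0.012}],
    environment={"cape": 2500.0},
)
choices = [select_model(o) for o in (new_object, tracked_object, full_object)]
```

Falling back to the radar-only model whenever supplemental data are missing keeps a prediction available on every scan without waiting for additional inputs, which is in line with the emphasis on minimizing time lag.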

This presentation will provide an overview of the prediction algorithm's design and its current state. We will discuss the generation of the pre-tornadic dataset and the methods used to train and test the new models, and present preliminary results on the objective performance of the new random forest models in predicting tornadoes up to an hour before tornadogenesis.