8.6 Development of Radar-Identified Storm Cell and Track Dataset for Storm Motion Distributions and Machine Learning Applications

Wednesday, 15 January 2020: 11:45 AM
Dianna M. Francisco, Univ. of Oklahoma/CIMMS and NOAA/NSSL, Norman, OK; and T. M. Smith, K. M. Calhoun, and P. A. Campbell

Machine learning applications often draw on large, automatically generated datasets, so it is important to examine whether the data are physically sound and practically usable before feeding them into a machine learning algorithm. An automated storm-identification and tracking algorithm that is part of the WDSS-II system at NSSL was used to define individual storm cells and their attributes on spatial grids and to associate them over time. This identification (segmentation) and tracking (motion) algorithm was applied to the Multi-Year Reanalysis of Remotely Sensed Storms (MYRORSS) dataset, an archived and quality-controlled dataset of Multi-Radar Multi-Sensor (MRMS) products. Specifically, either the MRMS Merged Composite (maximum) Reflectivity product or the Reflectivity at the -10°C isotherm product was used to identify individual storm cells and to output the corresponding cell attributes, such as cell speed and a cell's maximum value of vertically integrated liquid (max VIL). Storm cell ID (and severity) was defined by a minimum threshold of merged composite reflectivity. Individual cells were tracked over time, and track statistics, such as track duration (in seconds), were recorded.
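To make the threshold-based identification step concrete, the following is a minimal sketch, not the WDSS-II algorithm itself. It swaps in plain connected-component labeling for the actual segmentation method, and the function name `label_cells`, the 30 dBZ default, and the 4-neighbor connectivity are all illustrative assumptions rather than values from this work.

```python
from collections import deque


def label_cells(refl, min_dbz=30.0):
    """Label connected regions of a 2D reflectivity grid at or above min_dbz.

    Simplified stand-in for storm-cell segmentation: 4-connected flood fill.
    The 30 dBZ default is a placeholder, not the threshold used in the study.
    Returns (labels, n_cells); labels is 0 where no cell was found.
    """
    rows, cols = len(refl), len(refl[0])
    labels = [[0] * cols for _ in range(rows)]
    n_cells = 0
    for r in range(rows):
        for c in range(cols):
            if refl[r][c] < min_dbz or labels[r][c]:
                continue  # below threshold, or already assigned to a cell
            n_cells += 1
            labels[r][c] = n_cells
            queue = deque([(r, c)])
            while queue:  # breadth-first fill of this cell
                i, j = queue.popleft()
                for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ni, nj = i + di, j + dj
                    if (0 <= ni < rows and 0 <= nj < cols
                            and not labels[ni][nj]
                            and refl[ni][nj] >= min_dbz):
                        labels[ni][nj] = n_cells
                        queue.append((ni, nj))
    return labels, n_cells
```

Raising or lowering `min_dbz` in such a sketch changes how many cells are found and how large they are, which is the kind of sensitivity that the tunable reflectivity thresholds control.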

The storm-identification and tracking algorithm has multiple tunable parameters, including a smoothing filter, a cell matching method, and minimum and maximum reflectivity thresholds. These parameters were altered, one at a time, to produce tracks that are more accurate in time and space. Each change in the algorithm parameters yielded an additional dataset of individual cell IDs and tracks; therefore, an ensemble of different tracking algorithms was examined. Tracking statistics were computed to qualitatively examine the cell tracks throughout CONUS for one convectively active month (April 2011) and to compare them with tracks produced by the other tracking algorithms. These tracking statistics, including median duration, linearity error, and mismatch error (e.g., max VIL discontinuity), were factors in choosing the "best" tracking algorithm. For a subset of storms, cells and tracks were identified manually and then compared with the corresponding cell IDs and tracks produced by the "best" automated storm-identification and tracking algorithm, which highlighted the data quality gap for machine learning applications.
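The three tracking statistics named above could be computed along the following lines. This is a sketch under stated assumptions: linearity error is taken as the RMS perpendicular distance of cell centroids from their best-fit line, and mismatch error as the standard deviation of max VIL along a track. The abstract does not give exact definitions, and the function names and track layout are hypothetical.

```python
import math
from statistics import fmean, median, pstdev


def track_duration(times_s):
    """Track duration in seconds: last observation time minus first."""
    return max(times_s) - min(times_s)


def linearity_error(xs, ys):
    """RMS perpendicular distance of centroids from their best-fit line.

    Equals the square root of the smaller eigenvalue of the 2x2
    (population) covariance matrix of the centroid positions.
    """
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs) / n
    syy = sum((y - my) ** 2 for y in ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    lam_min = 0.5 * (sxx + syy) - math.sqrt((0.5 * (sxx - syy)) ** 2 + sxy ** 2)
    return math.sqrt(max(lam_min, 0.0))  # clamp tiny negative round-off


def mismatch_error(max_vil):
    """Assumed definition: spread (population std. dev.) of max VIL on a track."""
    return pstdev(max_vil)


def summarize(tracks):
    """tracks: list of dicts with equal-length lists 't', 'x', 'y', 'max_vil'."""
    return {
        "median_duration_s": median(track_duration(tr["t"]) for tr in tracks),
        "mean_linearity_error": fmean(
            linearity_error(tr["x"], tr["y"]) for tr in tracks),
        "mean_mismatch_error": fmean(
            mismatch_error(tr["max_vil"]) for tr in tracks),
    }
```

Under these definitions, longer median duration together with lower linearity and mismatch errors would favor one parameter setting over another when comparing members of the ensemble.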

MRMS products were available throughout CONUS for multiple years in the MYRORSS dataset, which allowed for a robust sample size of severe convective storm cells in various environments and storm modes. This "best" tracking algorithm was applied to multiple years of MYRORSS data to develop a cell track dataset for investigation. Evaluating the storm motion vectors of all cells in the dataset gave an estimate of the storm motion distributions for severe convective storm cells throughout CONUS. Future work includes using this multi-year cell track dataset to examine storm cell longevity and as a training dataset for machine learning algorithms.
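As an illustration of how a storm motion distribution might be derived from such a track dataset, the sketch below computes a per-track motion vector and bins the resulting headings. Everything here is an assumption for illustration: positions are taken to be in meters on a local east/north plane, times in seconds, motion is estimated from the first and last centroid only, and the function names are hypothetical.

```python
import math
from collections import Counter


def motion_vector(t0, x0, y0, t1, x1, y1):
    """Mean motion (u eastward, v northward) in m/s between two centroids."""
    dt = t1 - t0
    if dt <= 0:
        raise ValueError("track must span a positive time interval")
    return (x1 - x0) / dt, (y1 - y0) / dt


def speed_and_heading(u, v):
    """Speed in m/s and heading in degrees clockwise from north.

    Heading is the direction the cell moves toward (0 = north, 90 = east).
    """
    speed = math.hypot(u, v)
    heading = math.degrees(math.atan2(u, v)) % 360.0
    return speed, heading


def heading_histogram(headings, bin_width=45.0):
    """Count headings per bin; keys are the lower edge of each bin in degrees."""
    n_bins = int(round(360.0 / bin_width))
    counts = Counter()
    for h in headings:
        idx = int((h % 360.0) // bin_width) % n_bins  # wrap 360 back to 0
        counts[idx * bin_width] += 1
    return dict(counts)
```

A speed histogram could be built the same way; with multiple years of tracks, such counts approximate the motion distributions described above.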
