The classification of convective systems into morphological types, or modes, from radar mosaic imagery has most often been done manually, which can limit the practical sample size of a study of convective morphology or introduce inconsistency. However, recent advances in supervised machine learning have made automated classification from radar imagery increasingly reliable and practical. While these recent methods have shown promise in broad classifications, such as differentiating between mesoscale convective system (MCS) patterns and non-MCS signatures, they have not yet seen widespread use for detailed classifications that discriminate between multiple subtypes of cellular and linear systems. One common scheme that offers these detailed subtypes is the nine-category scheme of Gallus et al. (2008, Wea. Forecasting), which comprises three cellular types: isolated cells (IC), clusters of cells (CC), and broken lines (BL); five linear types: squall lines without a stratiform rain region (NS), lines with parallel (PS), leading (LS), or trailing (TS) stratiform regions, and bow echoes (BE); and a non-linear (NL) type. This study seeks to automate this nine-category classification scheme by exploring two general categories of machine learning techniques, ensembles of decision trees and convolutional neural networks, and evaluating their reliability in performing this classification.
The two approaches used in this study both extend the successful broad classification methods of Haberlie and Ashley (2018, J. Appl. Meteor. Climatol.) to the more detailed nine-category scheme. Using a set of manually labeled training and testing data collected from NEXRAD composite reflectivity mosaics from 2004 to 2016, the first approach applies the same algorithms based on ensembles of decision trees (random forest, gradient boosting, and XGBoost) to both the morphological parameters used previously and new parameters meant to differentiate between the detailed modes (such as region connectivity, position of a stratiform region, or curvature of a characteristic curve of a system). The second approach uses largely the same method as Haberlie and Ashley for broad classification into cellular and linear types, and then applies separate convolutional neural networks to differentiate between the respective subtypes of cellular and linear systems. Validation statistics for the two approaches, computed on the manually labeled testing data and including probability of detection and skill scores, will be shown alongside subjective evaluations to assess whether either technique is reliable enough for potential future use as an automated procedure in research or operations.
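
As an illustration of the validation statistics mentioned above, the following minimal sketch computes a per-mode probability of detection and one multi-category skill score from paired manual and automated labels. It is not the study's code. The abstract does not say which skill scores are used, so the Heidke skill score is an assumption chosen for illustration, and the function names and the sample labels are hypothetical.

```python
import numpy as np

# The nine mode abbreviations of the Gallus et al. (2008) scheme, as listed in the abstract.
MODES = ["IC", "CC", "BL", "NS", "PS", "LS", "TS", "BE", "NL"]


def confusion_matrix(true_labels, predicted_labels, classes=MODES):
    """Count cases by manual label (rows) and automated label (columns)."""
    index = {name: i for i, name in enumerate(classes)}
    matrix = np.zeros((len(classes), len(classes)), dtype=int)
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true], index[predicted]] += 1
    return matrix


def probability_of_detection(matrix):
    """Per-mode POD: hits divided by the number of manually labeled cases.

    Modes with no manually labeled cases get NaN rather than a misleading 0.
    """
    hits = np.diag(matrix)
    row_totals = matrix.sum(axis=1)
    pod = np.full(len(hits), np.nan)
    np.divide(hits, row_totals, out=pod, where=row_totals > 0)
    return pod


def heidke_skill_score(matrix):
    """Multi-category Heidke skill score (an assumed choice of skill score).

    Compares the fraction of correct classifications with the fraction
    expected by chance given the row and column totals.
    """
    total = matrix.sum()
    correct = np.trace(matrix) / total
    expected = (matrix.sum(axis=1) * matrix.sum(axis=0)).sum() / total**2
    return (correct - expected) / (1.0 - expected)


# Hypothetical labels for six systems, for demonstration only.
manual = ["IC", "IC", "TS", "TS", "BE", "NL"]
automated = ["IC", "CC", "TS", "TS", "TS", "NL"]

cm = confusion_matrix(manual, automated)
pod = dict(zip(MODES, probability_of_detection(cm)))
hss = heidke_skill_score(cm)
```

Keeping the full nine-by-nine confusion matrix, rather than only summary scores, also shows which detailed modes are confused with one another (here, a bow echo classified as a trailing stratiform line).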