1.2 Real-Time and Climatological Storm Classification Using Support Vector Machines

Monday, 8 January 2018: 9:00 AM
Room 7 (ACC) (Austin, Texas)
Eli Jergensen, CIMMS/Univ. of Oklahoma, Norman, OK; and A. McGovern, C. Karstens, H. Obermeier, and T. Smith

Classification of storms is an important part of weather forecasting, as knowledge of storm type assists in predicting features such as storm duration, precipitation, and damage potential. Storm classification also enables the analysis of long-term trends in weather patterns. Using characteristics drawn from radar data provided by the Multi-Year Reanalysis of Remotely Sensed Storms (MYRORSS) and Near Storm Environment (NSE) data, our goal is to develop an automated storm classification algorithm using machine learning techniques that can classify storms both in real time, to aid forecasters, and in retrospect, to accurately build a climatology of storms across the continental United States.

This talk is the second of two highlighting different machine learning approaches to classification. Here we focus on Support Vector Machine (SVM) classification systems using linear and Radial Basis Function (RBF) kernels. The general method for training an SVM consists of three steps. First, labeled data are partitioned into training and testing sets at a 2-to-1 ratio. To prepare the training data for learning, each variable is scaled independently to zero mean and unit variance, and Principal Component Analysis (PCA) is then performed on the scaled data. Only the top 50 components are kept, lowering the dimensionality by a factor of 5. Next, cross-validation selects the optimal hyperparameters for each SVM kernel, and the optimal SVMs are trained on the prepared data through supervised learning. Lastly, each trained model classifies the testing data, which are transformed in the same manner as the training data. The resulting predictions form a confusion matrix, from which a Peirce Skill Score is computed.
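The three steps above can be sketched with scikit-learn. This is a minimal illustration, not the authors' implementation: the synthetic data, the 250-feature input size (chosen only so that 50 components gives a factor-of-5 reduction), the three classes, the hyperparameter grids, and the 3-fold cross-validation are all assumptions.

```python
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the labeled storm data (assumed: 250 features, 3 classes).
X, y = make_classification(n_samples=600, n_features=250, n_informative=30,
                           n_classes=3, random_state=0)

# Step 1: partition into training and testing data at a 2-to-1 ratio.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=1 / 3, stratify=y, random_state=0)

# Assumed hyperparameter grids, one per kernel.
kernel_grids = {
    "linear": {"svm__C": [0.1, 1, 10]},
    "rbf": {"svm__C": [0.1, 1, 10], "svm__gamma": ["scale", 0.01]},
}

confusion = {}
for kernel, grid in kernel_grids.items():
    # Scaling and PCA are fit on training data only; the pipeline then
    # applies the same transformation to the testing data.
    pipeline = Pipeline([
        ("scale", StandardScaler()),      # zero mean, unit variance per variable
        ("pca", PCA(n_components=50)),    # keep the top 50 components
        ("svm", SVC(kernel=kernel)),
    ])
    # Step 2: cross-validation selects hyperparameters, then refits the best model.
    search = GridSearchCV(pipeline, grid, cv=3)
    search.fit(X_train, y_train)
    # Step 3: classify the testing data and build a confusion matrix.
    confusion[kernel] = confusion_matrix(y_test, search.predict(X_test))
```

Wrapping the scaler and PCA in a pipeline means they are refit inside each cross-validation fold, so held-out data never influences the transformation.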

Presently, we find that both SVM kernels produce Peirce scores in the mid-0.50s. SVMs with linear kernels score up to 0.57, while SVMs with RBF kernels score up to 0.53. Additionally, performing PCA before training a model greatly accelerates the training process and yields a much better Peirce score (+0.18 on average). Future work may investigate deep learning techniques such as neural networks to see if they produce higher scores.
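For reference, a Peirce Skill Score can be computed from a confusion matrix as in the sketch below. It assumes the standard multi-category form of the score, with rows holding observed classes and columns holding predicted classes. The abstract does not state which formulation the authors used, and the function name is hypothetical.

```python
def peirce_skill_score(confusion):
    """Multi-category Peirce Skill Score.

    confusion[i][j] counts cases observed as class i and predicted as class j.
    Returns 1 for perfect classification and 0 for predictions that carry
    no more skill than random guessing.
    """
    k = len(confusion)
    total = sum(sum(row) for row in confusion)
    # Marginal frequencies of observed and predicted classes.
    observed = [sum(confusion[i]) / total for i in range(k)]
    predicted = [sum(confusion[i][j] for i in range(k)) / total for j in range(k)]
    # Fraction correct, and the fraction expected correct by chance.
    accuracy = sum(confusion[i][i] for i in range(k)) / total
    chance = sum(o * p for o, p in zip(observed, predicted))
    return (accuracy - chance) / (1.0 - sum(o * o for o in observed))
```

For a two-class matrix this reduces to the probability of detection minus the probability of false detection.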
