J4.6 Automated Classification of Convective Areas from Radar Reflectivity Using Decision Trees (2008)

19th Conference on Probability and Statistics
Sixth Conference on Artificial Intelligence Applications to Environmental Science

David John Gagne II, University of Oklahoma School of Meteorology, Norman, OK; and A. McGovern and J. Brotzge

This paper presents an automated approach to classifying storms based on their radar reflectivity structure using decision trees. An automated system can quickly sort through large quantities of data, and a decision tree returns a human-readable model that selectively identifies the most important attributes of the data. Our method of storm classification combines two machine learning techniques: k-means clustering and decision trees. K-means segments the reflectivity data into clusters, and decision trees classify each cluster.
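As an illustration of the segmentation step, the following minimal sketch runs k-means on the pixel coordinates of a toy reflectivity grid. Everything specific in it is an assumption made for the example and not taken from the paper: the synthetic two-echo grid, the 20 dBZ echo threshold, clustering on (x, y) position only, the choice of k = 2, and the deterministic farthest-first seeding (used here so the result is reproducible).

```python
import math

def synthetic_reflectivity(size=40):
    """Build a toy reflectivity grid (dBZ) with two Gaussian-shaped echoes."""
    echoes = [(10, 10, 55.0), (30, 28, 35.0)]  # (x, y, peak dBZ), made up
    grid = [[0.0] * size for _ in range(size)]
    for cx, cy, peak in echoes:
        for y in range(size):
            for x in range(size):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                grid[y][x] = max(grid[y][x], peak * math.exp(-d2 / 30.0))
    return grid

def echo_pixels(grid, min_dbz=20.0):
    """Return (x, y) coordinates of pixels at or above an assumed threshold."""
    return [(float(x), float(y))
            for y, row in enumerate(grid)
            for x, dbz in enumerate(row) if dbz >= min_dbz]

def farthest_first(points, k):
    """Deterministic seeding: start at points[0], then repeatedly add the
    point farthest from all centroids chosen so far."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(
            points, key=lambda p: min(math.dist(p, c) for c in centroids)))
    return centroids

def kmeans(points, k, max_iter=100):
    """Lloyd's algorithm on 2-D points; returns (centroids, labels)."""
    centroids = farthest_first(points, k)
    labels = None
    for _ in range(max_iter):
        # Assignment step: attach each pixel to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                      for p in points]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = (sum(x for x, _ in members) / len(members),
                                sum(y for _, y in members) / len(members))
    return centroids, labels

pixels = echo_pixels(synthetic_reflectivity())
centroids, labels = kmeans(pixels, k=2)
for c, (cx, cy) in enumerate(centroids):
    print(f"cluster {c}: {labels.count(c)} pixels, centroid=({cx:.1f}, {cy:.1f})")
```

Each resulting cluster is a spatial region whose pixels could then be summarized into attributes (for example reflectivity statistics and shape measures) for a classifier. Real radar scenes would need a principled choice of k and threshold, which this sketch does not attempt.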

A k-means clustering algorithm is first used to divide the radar reflectivity data into spatial regions. Each cluster is then sorted into one of two general categories, convective or stratiform, based on its reflectivity magnitude and shape. Each convective cluster is classified as either a cellular or a linear system. Cellular systems are further classified as isolated weak, isolated strong, or multicell. Linear systems are further classified as trailing stratiform, leading stratiform, or parallel stratiform. The Waikato Environment for Knowledge Analysis (WEKA), a machine learning suite, was used to develop the decision trees.
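The label hierarchy described above can be written down as a small nested structure, with one decision point per level. The sketch below is only an illustration: the label names come from the abstract, but the feature names (max_dbz, aspect_ratio, n_cores, stratiform_side) and every threshold in DECIDERS are hypothetical placeholders standing in for the learned decision trees, not values from the paper.

```python
# Label hierarchy from the abstract; leaves are empty dicts.
TAXONOMY = {
    "stratiform": {},
    "convective": {
        "cellular": {
            "isolated weak": {},
            "isolated strong": {},
            "multicell": {},
        },
        "linear": {
            "trailing stratiform": {},
            "leading stratiform": {},
            "parallel stratiform": {},
        },
    },
}

def leaf_labels(tree, prefix=()):
    """List every root-to-leaf label path in the hierarchy."""
    paths = []
    for name, children in tree.items():
        path = prefix + (name,)
        paths.extend(leaf_labels(children, path) if children else [path])
    return paths

def classify(features, deciders, tree=TAXONOMY, node="root"):
    """Walk the hierarchy, asking the decider at each level to pick a child."""
    path = []
    while tree:
        choice = deciders[node](features)
        if choice not in tree:
            raise ValueError(f"{node!r} decider returned unknown label {choice!r}")
        path.append(choice)
        tree, node = tree[choice], choice
    return tuple(path)

# Hypothetical hand-written rules; in the paper these roles are played by
# decision trees learned from data, and the thresholds here are invented.
DECIDERS = {
    "root": lambda f: "convective" if f["max_dbz"] >= 40 else "stratiform",
    "convective": lambda f: "linear" if f["aspect_ratio"] >= 3 else "cellular",
    "cellular": lambda f: ("multicell" if f["n_cores"] > 1 else
                           "isolated strong" if f["max_dbz"] >= 50 else
                           "isolated weak"),
    "linear": lambda f: f["stratiform_side"] + " stratiform",
}

storm = {"max_dbz": 58, "aspect_ratio": 1.4, "n_cores": 1,
         "stratiform_side": "trailing"}
rain = {"max_dbz": 28, "aspect_ratio": 1.1, "n_cores": 0,
        "stratiform_side": "parallel"}
print(classify(storm, DECIDERS))  # ('convective', 'cellular', 'isolated strong')
print(classify(rain, DECIDERS))   # ('stratiform',)
```

Keeping the general and specific decisions in separate levels mirrors the two-stage scheme in the text: the first level can be evaluated and deployed on its own, independently of how well the finer levels perform.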

Multiple decision trees were constructed with both morphological and reflectivity attributes for both the general and specific classifications. A training data set was developed from simulated reflectivity data output by the Advanced Regional Prediction System (ARPS) model. The decision trees were then tested using actual radar observations from the CASA IP1 network. Overall, the accuracy of the general classification scheme was estimated at 90%, indicating a reliable general classification tree. For the more specific classification scheme, the accuracy ranged from 55% to 80% across the test data sets, implying that additional work is needed. Nevertheless, the general classification scheme is now ready for operational implementation, having been verified using operational radar data.
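The train-on-one-source, test-on-another workflow can be illustrated with a small information-gain tree learner. This is a sketch under stated assumptions, not WEKA's implementation or the authors' trees: the two attributes (mean_dbz, aspect_ratio), their distributions, the noisier "test" set, and the depth limit are all invented for the example, and the data are random numbers rather than ARPS output or CASA observations.

```python
import math
import random
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature index, threshold) with the highest information gain."""
    base, best = entropy(labels), (None, None, 0.0)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            gain = base - (len(left) * entropy(left)
                           + len(right) * entropy(right)) / len(labels)
            if gain > best[2]:
                best = (f, t, gain)
    return best[0], best[1]

def build_tree(rows, labels, depth=0, max_depth=3):
    """Grow a tree of (feature, threshold, left, right) tuples; leaves are labels."""
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or depth == max_depth:
        return majority
    f, t = best_split(rows, labels)
    if f is None:
        return majority
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree(*zip(*left), depth + 1, max_depth),
            build_tree(*zip(*right), depth + 1, max_depth))

def predict(tree, row):
    """Follow threshold tests down to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

def make_set(n, sigma, seed):
    """Synthetic clusters: (mean_dbz, aspect_ratio) rows with made-up statistics."""
    rng = random.Random(seed)
    rows, labels = [], []
    for i in range(n):
        label = "convective" if i % 2 == 0 else "stratiform"
        mean = 45.0 if label == "convective" else 25.0
        rows.append((rng.gauss(mean, sigma), rng.uniform(1.0, 4.0)))
        labels.append(label)
    return rows, labels

train_rows, train_labels = make_set(200, sigma=4.0, seed=1)  # stands in for simulated data
test_rows, test_labels = make_set(200, sigma=5.0, seed=2)    # stands in for observations
tree = build_tree(train_rows, train_labels)
hits = sum(predict(tree, r) == y for r, y in zip(test_rows, test_labels))
accuracy = hits / len(test_labels)
print(f"root split on feature {tree[0]} at {tree[1]:.1f}; test accuracy {accuracy:.2f}")
```

Because the tree is a nested set of threshold tests, it can be printed and read directly, which is the human-readable property the abstract highlights. Evaluating on a data set drawn separately from the training set is what makes the reported accuracy an estimate of performance on unseen data.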

Joint Session 4, Bridging the Gap between Artificial Intelligence and Statistics in Applications to Environmental Science-II
Wednesday, 23 January 2008, 10:30 AM-12:00 PM, 219
