S5 Classification and Predictive Modeling of Oceanography Data Using Data Mining Techniques

Sunday, 6 January 2019
Hall 4 (Phoenix Convention Center - West and North Buildings)
Cassandra Chang, College of William and Mary, Los Angeles, CA; and S. Abuomar

Oceanographic data from instruments such as the Acoustic Doppler Current Profiler (ADCP) used in this study are high-dimensional and demanding to process and interpret. Data mining techniques have proven useful for datasets of this type, making analysis both easier and deeper. This study used both unsupervised and supervised machine learning to analyze and model data collected from an ADCP. Principal component analysis (PCA) was applied to reduce the dimensionality of the data for visualization and cluster analysis. The main features common to both datasets included physical and chemical properties, such as temperature, location, and error velocity. The similarity of the features driving cluster formation shows that PCA consistently identified the most important features in the data. Support vector machines (SVM) with various kernel functions and constant values were highly accurate in organizing the data into classes defined by the transect from which they were collected, with the dot product and polynomial kernel functions achieving the highest classification accuracy overall. These machine learning techniques were successful in analyzing ADCP data and may be applied to similar oceanographic datasets in future studies.
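
The workflow described above (PCA for dimensionality reduction, then SVM classification by transect with dot product and polynomial kernels) might be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes scikit-learn, uses randomly generated stand-in data rather than ADCP measurements, and the feature count, number of transects, and the values of `C` and `degree` are all hypothetical choices. The abstract does not say which "constant values" were varied, so treating them as SVM hyperparameters is an assumption.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for ADCP records (NOT the study's data):
# 3 hypothetical transects, 100 samples each, 12 features per sample.
rng = np.random.default_rng(0)
n_per_transect, n_features = 100, 12
centers = rng.normal(scale=4.0, size=(3, n_features))
X = np.vstack([c + rng.normal(size=(n_per_transect, n_features)) for c in centers])
y = np.repeat(np.arange(3), n_per_transect)  # class label = transect index

# Unsupervised step: standardize, then project onto two principal
# components for visualization and cluster analysis.
X_scaled = StandardScaler().fit_transform(X)
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X_scaled)
explained = pca.explained_variance_ratio_  # variance share per component

# Supervised step: hold out a test set, then compare kernels.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)


def fit_and_score(kernel, C=1.0, degree=3):
    """Train an SVM with the given kernel and return test-set accuracy.

    C and degree are assumed hyperparameters; degree only affects "poly".
    """
    model = make_pipeline(StandardScaler(), SVC(kernel=kernel, C=C, degree=degree))
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)


# "linear" is scikit-learn's name for the dot product kernel.
scores = {kernel: fit_and_score(kernel) for kernel in ("linear", "poly")}
```

Here `X_2d` holds the two-component projection that could be plotted or passed to a clustering routine, and `scores` maps each kernel name to its held-out accuracy on the synthetic data. Those accuracies say nothing about the study's results; they only show how such a kernel comparison could be set up.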