Tuesday, 30 January 2024: 5:45 PM
338 (The Baltimore Convention Center)
NOAA’s Center for Operational Oceanographic Products and Services (CO-OPS) manages over 250 water level stations across the United States and its territories that collect high-quality, continuous 6-minute water level data. The quality control (QC) of this data is challenging to automate because the data can be affected by a range of unavoidable issues, including sensor damage, communication problems, and extreme weather events. CO-OPS has implemented some statistical QC procedures to flag suspect or bad data, but our methods still require a significant amount of human investigation. Furthermore, once data is identified as bad, considerable effort is required to correct the observations using backup sensor data, nearby stations, tide predictions, a statistical fit (e.g., least squares), or some combination of these methods. In this project we train an AI model to classify data as good or bad using the original (raw) water level data and the human-reviewed, quality-controlled (labeled) data generated by CO-OPS. We build on our previous success in using an artificial neural network (ANN) model with seven inputs and two hidden nodes to accurately predict bad data at an initial set of five water level stations in the northeast US. Here we demonstrate ANN model accuracy when predicting bad water level data across our water level station network, using data from 50 representative stations across the continental US, Alaska, Hawaii, and the Caribbean islands. We also found a 10% increase in model accuracy when using geographic region-specific models (Gulf of Mexico, Northeast US, etc.) instead of the original generalized AI model calibrated for all locations. Accuracy is higher in regions that are tidally dominant. Finally, we explore methods such as ANNs, long short-term memory (LSTM) networks, and other recurrent neural networks (RNNs) to correct the bad water level data.
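To make the size of the ANN described above concrete, the following is a minimal sketch of a network with seven inputs, two hidden nodes, and a single output giving the probability that a sample is bad. Only the input and hidden-node counts come from the abstract. The choice of the seven input features, the tanh and sigmoid activations, the random initial weights, and all function names are illustrative assumptions, not the CO-OPS implementation.

```python
import math
import random

N_INPUTS = 7  # from the abstract; which seven features are used is not stated
N_HIDDEN = 2  # from the abstract


def sigmoid(x):
    """Logistic function mapping a real value to the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_params(seed=0):
    """Create small random weights (placeholder values; a real model is trained)."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(N_INPUTS)]
               for _ in range(N_HIDDEN)],
        "b1": [0.0] * N_HIDDEN,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(N_HIDDEN)],
        "b2": 0.0,
    }


def predict_bad_probability(params, features):
    """Forward pass: 7 inputs -> 2 tanh hidden nodes -> 1 sigmoid output."""
    if len(features) != N_INPUTS:
        raise ValueError(f"expected {N_INPUTS} features, got {len(features)}")
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(params["w1"], params["b1"])
    ]
    z = sum(w * h for w, h in zip(params["w2"], hidden)) + params["b2"]
    return sigmoid(z)


def count_parameters(params):
    """Total number of trainable weights and biases in the network."""
    return (sum(len(row) for row in params["w1"]) + len(params["b1"])
            + len(params["w2"]) + 1)


params = init_params()
sample = [0.1, -0.2, 0.05, 0.0, 0.3, -0.1, 0.2]  # hypothetical feature vector
probability = predict_bad_probability(params, sample)
print(f"parameters: {count_parameters(params)}, P(bad) = {probability:.3f}")
```

Under these assumptions the network has only 19 trainable parameters (14 input-to-hidden weights, 2 hidden biases, 2 hidden-to-output weights, and 1 output bias), so training a separate model per geographic region would be inexpensive.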
An AI-based water level QC system will substantially reduce the hours required to produce high-quality water level time series. This project will also fill a critical need in the water level monitoring community by providing a new, cost-effective method for external partners and users who have limited resources to quality control their data and who do not have the large labeled data sets that CO-OPS has created over many years.

