1039 ARM Visualization Tools for Ensuring Data Quality and Consistency and Managing Data Processes

Wednesday, 25 January 2017
4E (Washington State Convention Center)
Krista L. Gaustad, PNNL, Richland, WA; and S. Beus and J. W. Monroe

The Atmospheric Radiation Measurement (ARM) program collects observations of atmospheric processes that the atmospheric community uses to improve the understanding of cloud properties and to reduce uncertainties in climate models. The ARM program's Data Management Facility (DMF) is responsible for the timely collection, processing, value-added processing, reprocessing, and delivery of data products from ARM research sites to the ARM Archive. This poster presents the visualization tools used by the DMF to evaluate and communicate the quality, completeness, and accessibility of the data and of the processes that create the data. The tools discussed include a data quality inspector, a data inventory viewer, a data comparison tool, and a process status viewer. The data quality inspection tool, "dq_inspector", quickly creates user-customizable plots of the data in the context of the quality flags set during processing. These plots allow users to visualize the quality of the data and to review the specifics and results of the quality assessments applied. The data inventory viewer ("Data File Inventory") displays ARM's data catalog on a timeline with color-coded overlays documenting the consistency of the data in the ARM Archive relative to the data produced at the ARM data processing center. DMF operators use it to ensure that data are transferred to the archive correctly and in a timely manner. The "ncreview" data comparison tool was developed to expedite validating that reprocessing efforts have changed the data in the expected, and only the expected, manner. It displays an interactive web-based comparison of two sets of netCDF data files, showing differences in both the data and metadata, and uses summary metrics such as minimum, maximum, standard deviation, and counts of missing and invalid values.
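The kind of summary-metric comparison described for "ncreview" can be illustrated with a minimal sketch. This is not the ncreview implementation: the function names `summarize` and `compare`, the `-9999.0` missing-value sentinel, and the treatment of non-finite values as "invalid" are all assumptions made for illustration, and plain Python lists stand in for variables read from netCDF files.

```python
import math
import statistics

# Assumed missing-value sentinel for this sketch; real files declare their own.
MISSING = -9999.0


def summarize(values, missing=MISSING):
    """Return summary metrics for one variable's values."""
    n_missing = sum(1 for v in values if v == missing)
    # "Invalid" is taken here to mean NaN or infinite values (an assumption).
    n_invalid = sum(1 for v in values if not math.isfinite(v))
    good = [v for v in values if v != missing and math.isfinite(v)]
    return {
        "min": min(good) if good else None,
        "max": max(good) if good else None,
        "std": statistics.pstdev(good) if good else None,
        "n_missing": n_missing,
        "n_invalid": n_invalid,
    }


def compare(old, new):
    """Return only the metrics that differ, as (old, new) pairs."""
    a, b = summarize(old), summarize(new)
    return {key: (a[key], b[key]) for key in a if a[key] != b[key]}
```

Calling `compare(old_values, new_values)` on the same variable before and after reprocessing returns an empty dictionary when the summaries agree, so any non-empty result points a reviewer at the metrics that changed.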
The ARM program manages over 1,1000 processes per hour that ingest the data into a common format and apply additional processing to create data products of higher value. The "dsview" data system viewer is used by the DMF to simplify the management of ARM's data processes. It provides color-coded indicators of process state with integrated access to logs and summary reports. The application of visualization tools such as these has allowed the program to approximately double data throughput over the past five years while maintaining data accessibility and improving both data quality and reproducibility.