92nd American Meteorological Society Annual Meeting (January 22-26, 2012)

Wednesday, 25 January 2012: 4:15 PM
From Sensor to Archive: Data Flow, Tools and Management of Observational Data at NCAR's Earth Observing Laboratory
Room 348/349 (New Orleans Convention Center)
Michael D. Daniels, NCAR, Boulder, CO; and C. Martin, G. Stossmeister, and S. Williams

The Computing, Data and Software (CDS) Facility of NCAR's Earth Observing Laboratory (EOL) is responsible for developing and maintaining state-of-the-art data services for the suite of observing platforms that EOL provides to the atmospheric sciences community. Science derived from observing platform data is often the result of years of planning before a field deployment, and millions of dollars are spent on the deployment itself in hopes of obtaining the desired dataset. Today, these services include interactive collaboration tools and real-time streams of observing platform data and data from other sources, which have facilitated and encouraged wider participation among researchers. During field experiments, experts examine our data and compare them with other measurements, notifying instrument operators of data quality problems so that they can be addressed rapidly, even during a research flight. The services themselves have become a complex web of interconnected systems, communications infrastructure, and data streams, which CDS monitors during field operations, often through 24/7 shifts, using sophisticated online tools to alert engineers to potential problems before they escalate.

Once the field phase is completed, tools and procedures are used during the collection and processing of datasets to record each step and revision of a dataset, so that the data quality process is well tracked and documented. The EOL Metadata Database and Cyberinfrastructure (EMDAC) system is key to this process: it provides an integrated set of tools used to manage and maintain data and metadata, including ingest, processing, quality assurance, long-term archival, and community access to data housed at EOL as well as at distributed multi-agency data portals. This is especially important because data volumes, the need for immediate access, and the complexity of datasets have continued to increase. Streaming data in real time across the Internet and including supplementary, multi-agency and international data to augment field campaigns are now the status quo.

CDS emphasizes data stewardship by maintaining data integrity throughout all phases of the data lifecycle, including planning, acquisition, processing, curation and access. An overview of EOL data flow, tools, challenges, and future directions will be presented.

