J12.2 The CUAHSI Water Data Center: Integrating Standards, Data, and Software to Empower Scientists to Discover, Use, Store, and Share Water Data

Thursday, 10 January 2013: 1:45 PM
Room 12A (Austin Convention Center)
Alva L. Couch, CUAHSI = Consortium of Universities for the Advancement of Hydrologic Sciences, Inc, Medford, MA; and R. P. Hooper, J. S. Arrigo, and J. Pollak

The proposed CUAHSI Water Data Center (WDC) will provide production-quality water data resources based upon the successful large-scale data services prototype developed by the CUAHSI Hydrologic Information System (CUAHSI HIS) project. In putting CUAHSI HIS into production, the WDC continues to concentrate upon providing time series data collected at fixed points or on moving platforms from sensors primarily (but not exclusively) in the medium of water. The WDC's missions include providing simple and effective data discovery tools useful to researchers in a variety of water-related disciplines, and providing simple and cost-effective data publication mechanisms for projects that do not desire to run their own data servers. Accordingly, the WDC's activities will include:

1. Rigorous curation of the water data catalog already assembled by CUAHSI HIS, to ensure accuracy of records and existence of declared sources.

2. Data backup and failover services for “at risk” data sources, such as those for smaller projects.

3. Creation and support of ubiquitously accessible data discovery and access tools, including web-based search and smartphone applications.

4. Partnerships with researchers to extend the state of the art in water data use.

5. Partnerships with industry to create plug-and-play data publishing from sensors, and to create domain-specific tools of interest to commercial entities.

At a deeper level, the WDC will serve as a knowledge resource for researchers of water-related issues, and will interface with other data centers to make their data more accessible to water researchers. Thus the WDC will serve as a vehicle for addressing some of the grand challenges of accessing and using water data, including:

a. Cross-domain data discovery: different scientific subdomains refer to the same kind of water data using different terminologies or ontologies, making discovery of data difficult for researchers outside the data provider's subdomain.

b. Cross-validation of data sources: much water data comes from sources lacking rigorous quality control procedures; such sources can be compared against others with rigorous quality control. The WDC enables this by making both kinds of sources available in the same search interface.

c. Data provenance: the appropriateness of data for use in a specific model or analysis often depends upon the exact details of how data was gathered and processed. The WDC will aid this by curating standards for metadata that are as descriptive as practical of the collection procedures. “Plug and play” sensor interfaces will fill in the metadata appropriate to each sensor without human intervention.

d. Model sharing: expressing hydrologic models in ways that are independent of programming language and platform. The WDC will curate models in OpenMI format.

e. Contextual search: discovering data based upon geological (e.g., aquifer) or geographic (e.g., location in a stream network) features external to metadata. The WDC will partner with researchers desiring contextual search, and then make the results available to all.

f. Data-driven search: discovering data that exhibit quality factors that are not described by the metadata. The WDC will partner with researchers desiring data-driven search, and then make the results available to all.

Many major data providers (e.g., federal agencies) do not have the mandate to provide access to data other than those they collect. CUAHSI HIS has assembled data from more than 90 different sources, thus demonstrating the promise of this approach. Meeting the grand challenges listed above will greatly enhance scientists' ability to discover, interpret, access, and analyze water data from across domains and sources to test Earth system hypotheses.
