Content, discovery, and accessibility enhancements to the NCAR Research Data Archive

Thursday, 21 January 2010: 2:30 PM
B217 (GWCC)
Douglas Schuster, NCAR, Boulder, CO; and S. Worley

The Research Data Archive (RDA) at NCAR is an open-access collection of hundreds of datasets relevant to climate studies. The RDA encompasses in situ observations, analyses, operational model output, and, especially, high-resolution reanalyses, some of which are unique in the U.S. and are provided through international agreements and collaborations. Recent system advancements give users much greater access capability and provide the scalability needed to meet current and future challenges in archive growth.

Newly developed data archival, discovery, and access tools form the foundation of the system improvements. Together they homogenize data management across the RDA and improve service for users by: supporting daily, weekly, and bulk updates of individual archives to online disk for immediate access; collecting and publishing metadata to enhance user search capabilities; distributing standard metadata to external large-scale search repositories such as the NASA Global Change Master Directory (GCMD) to expand user discovery; integrating file-level metadata to better inform user data selection and access; and automatically generating scripts designed to download user-selected parameters or multiple files. Additionally, users can now systematically access previously inaccessible data files archived on the NCAR Mass Storage System through an automated web selection interface.
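As a rough illustration of the script-generation idea described above, the following minimal Python sketch builds a shell script that fetches a list of user-selected files with `wget`. It is not the RDA implementation: the function name `build_download_script`, the base URL `https://example.org/data`, and the file paths are all hypothetical placeholders chosen for this example.

```python
from urllib.parse import quote


def build_download_script(base_url, file_paths):
    """Return the text of a shell script that fetches each selected file.

    base_url and file_paths are hypothetical inputs; a real service would
    derive them from the user's selection and its own file metadata.
    """
    lines = ["#!/bin/sh", "# Generated download script (illustrative only)"]
    for path in file_paths:
        # Percent-encode unsafe characters but keep "/" as a path separator.
        url = base_url.rstrip("/") + "/" + quote(path.lstrip("/"))
        # wget -N skips files that are already up to date locally.
        lines.append(f"wget -N '{url}'")
    return "\n".join(lines) + "\n"


# Hypothetical selection of two files from one dataset.
selected = ["dsNNN.0/2009/file_a.grb", "dsNNN.0/2009/file_b.grb"]
script = build_download_script("https://example.org/data", selected)
print(script)
```

Generating a plain shell script, rather than requiring a custom client, keeps the download step usable on any system where `wget` is available; that design choice is an assumption of this sketch, not a statement about the RDA tools.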

The enhanced functionality has improved user access to the growing and rich content of the RDA. Historical in situ global observation collections are augmented at daily to monthly intervals. Important analyses (e.g., SST) created by numerous providers are refreshed when new versions become available and are extended synchronously. Operational analysis and forecast model data are drawn from the real-time Internet Data Distribution system, supported by Unidata, and supplemented with data received directly from ECMWF and high-resolution archives from NCEP. The RDA is particularly strong in its offerings from the suite of available reanalyses. These include very complete product sets for the NCEP/NCAR and NCEP/DOE Global Reanalyses; the NCEP North American Regional Reanalysis; the 20th Century Reanalysis from NOAA and the Cooperative Institute for Research in Environmental Sciences; three reanalyses from ECMWF (ERA-15, ERA-40, and ERA-Interim); and the Japanese 25-year Global Reanalysis. A majority of these reanalyses are ongoing, so their data time series are regularly extended.

Complementing improved user accessibility, the application of common tools and methods across RDA operations has provided a proven new capacity for scalability. From this position, data services can be readily modified and adapted to suit user needs and expectations. For example, support for ensemble weather forecast research through the THORPEX Interactive Grand Global Ensemble (TIGGE) is made feasible by the upgraded RDA infrastructure. Here, 250 GB of NWP model output is collected daily from 10 international providers, and users are offered historical files or an access interface through which temporal, spatial, and parameter subsetting as well as regridding can be applied. Moreover, new data streams can be added to the RDA easily and economically, as was recently done for satellite data that are assimilated into the Weather Research and Forecasting (WRF) model.
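To make the notion of temporal, spatial, and parameter subsetting concrete, here is a minimal Python sketch that filters a small in-memory list of records. It is only an illustration of the concept: the record layout (`date`, `lat`, `lon`, `param`, `value`), the function name `subset`, and the sample values are assumptions made for this example and do not describe the TIGGE access interface or its data formats.

```python
from datetime import date


def subset(records, start, end, lat_range, lon_range, parameters):
    """Keep records inside a date window, a lat/lon box, and a parameter set.

    All bounds are inclusive. The record layout is hypothetical.
    """
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    return [
        r for r in records
        if start <= r["date"] <= end              # temporal subset
        and lat_min <= r["lat"] <= lat_max        # spatial subset (latitude)
        and lon_min <= r["lon"] <= lon_max        # spatial subset (longitude)
        and r["param"] in parameters              # parameter subset
    ]


# Made-up sample records, purely for demonstration.
records = [
    {"date": date(2009, 1, 1), "lat": 40.0, "lon": 255.0, "param": "t2m", "value": 271.3},
    {"date": date(2009, 1, 1), "lat": 40.0, "lon": 255.0, "param": "mslp", "value": 101900.0},
    {"date": date(2009, 1, 2), "lat": -10.0, "lon": 120.0, "param": "t2m", "value": 300.1},
    {"date": date(2009, 2, 1), "lat": 41.0, "lon": 256.0, "param": "t2m", "value": 275.8},
]

selected = subset(
    records,
    start=date(2009, 1, 1),
    end=date(2009, 1, 31),
    lat_range=(30.0, 50.0),
    lon_range=(250.0, 260.0),
    parameters={"t2m"},
)
print(len(selected))
```

Applying all three filters before delivery means a user receives only the records of interest rather than whole files, which matters when the incoming volume is on the order of hundreds of gigabytes per day.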