7.1 Computational Data Analysis in the era of Peta-Scale Digital Archives: Advancing Web Services to Reduce Data Access Latencies

Wednesday, 9 January 2013: 8:45 AM
Room 12B (Austin Convention Center)
Glenn K. Rutledge, NOAA/NESDIS/NCDC, Asheville, NC; and D. Williams and J. J. Hnilo

In the era of peta-scale multi-model General Circulation Model (GCM) and Numerical Weather Prediction (NWP) data archives, the need for on-demand computational resources and user-driven services has never been more apparent. Society has long recognized the value of cross-discipline science studies that link data across massive databases to deliver products for the attainment of knowledge, not only access to the raw data. Can users continue to be asked to download petabytes of data? What should be the role of the archive in the 21st century, given these new and growing massive databases? How can the archive community begin to provide not only distributed access but also distributed web services capabilities from a semantic-web standpoint? This research outlines some of the existing limitations of traditional archives and explores the many benefits of providing archive-based computational resources on peta-scale databases from a still-emerging web-services-based viewpoint. While on-demand computational resources for users have been available for several decades under the name of Grid Computing (Foster and Kesselman, 2004), and secure cloud computing is gaining popularity, on-demand web services for climate and weather products have been slow to be implemented. Archive centers must focus on their role as an archive; resources for such high-volume computations are limited and seemingly outside the scope of their primary mission. It will be shown herein that common web-service-based products and tools, which will eventually offer a cost-effective distributed approach to providing user services, will need to be advanced.
With peta-scale archives of climate and weather models, traditional access methods, as well as distributed data access and federated frameworks such as NOMADS and ESGF, are today under heavy burden to provide raw data, and even subsets of climate and weather model data, given the growing number of users and the need for climate mitigation and impact information. Computational analysis approaches such as Grid technologies have not yet been implemented in many organizations. For climate and weather model users, such products and tools might include pre-staged aggregations of common state variables, web-based climate service tools, and analytic engines such as CDAT, which can provide a path to satisfying many users of high-volume model archives.
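
As a rough illustration of why a pre-staged aggregation can reduce what a user has to download, the following Python sketch reduces a stack of gridded fields to one area-weighted mean per time step on the archive side. It is a minimal sketch under stated assumptions, not the method used by NOMADS, ESGF, or CDAT: the function names, the tiny 3 x 4 grid, and the temperature-like values are all hypothetical and chosen only for the example.

```python
import math


def area_weighted_mean(field, lats_deg):
    """Cosine-latitude weighted mean of one 2-D field (one row per latitude)."""
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, lats_deg):
        weight = math.cos(math.radians(lat))  # grid cells shrink toward the poles
        total += weight * sum(row)
        weight_sum += weight * len(row)
    return total / weight_sum


def prestage_global_means(time_steps, lats_deg):
    """Hypothetical archive-side aggregation: one mean value per time step."""
    return [area_weighted_mean(field, lats_deg) for field in time_steps]


def values_transferred(time_steps):
    """Count of grid values a user would download to do the same work locally."""
    return sum(len(row) for field in time_steps for row in field)


# Illustrative data only: two time steps on a 3-latitude by 4-longitude grid.
lats = [-60.0, 0.0, 60.0]
steps = [
    [[280.0] * 4, [300.0] * 4, [280.0] * 4],
    [[281.0] * 4, [301.0] * 4, [281.0] * 4],
]

series = prestage_global_means(steps, lats)
raw_count = values_transferred(steps)
print(series, "from", raw_count, "raw values reduced to", len(series))
```

In this toy case the user receives two numbers instead of 24 grid values; on a real multi-model archive the same kind of reduction would apply to far larger grids and time spans, which is the motivation for performing it next to the data.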