Data management and transport infrastructure for the SCOOP project
Gerald J. Creager, Texas A&M Univ., College Station, TX; and D. Cote, M. Smith, J. MacLaren, T. Yoksas, S. Emmerson, S. R. Chiswell, and P. Bogden
SURA (Southeastern Universities Research Association) comprises some 60 education and research entities across the Southeastern United States. While SURA is known for its expertise in high energy physics and information technology, a number of member institutions are also active in ocean observing and prediction projects with a variety of funding entities. SURA's Coastal Ocean Observing and Prediction (SCOOP) Program was initiated to build a best-of-breed program in ocean observing and prediction by coupling ocean domain scientists with experts in distributed (Grid) computing, data management, data transport, visualization, and metadata management. The SCOOP project is a partnership of eight primary partners spanning the coasts of the Gulf of Mexico and the Atlantic Ocean, with a particular focus on the coastlines of the Southeastern U.S. Over the last two years, the SCOOP partners have evolved a complex cyberinfrastructure to achieve these goals.
SCOOP was first funded in 2003 as a prototype to examine distributed numerical modeling processes, both in closely coupled computational clusters and in geographically diverse nodes using virtualization services. In 2004, additional information technology was added to bolster the more conventional Grid computing concepts and to enhance the standards work in both data and metadata. At the same time, a regional archive system was proposed to allow data sharing among the partners, and a data transport plan was suggested to reduce workload and duplication of effort by modelers who were seeking common data to initialize numerical ocean models.
Archive systems work was initiated for the 2005 Atlantic Hurricane Season. Estimates were solicited from the SCOOP partners, and two sites, Louisiana State University (LSU) and Texas A&M University (TAMU), agreed to provide archive services. Initial estimates provided by the modeling contingent suggested that 1-2 terabytes (TB) of data would be generated over the course of the hurricane season. Both sites entered the season with sufficient media capacity to fulfill SCOOP archiving requirements based on these early estimates. By the end of an unusually busy hurricane season (which extended into January 2006), Texas A&M was providing a total of 7 TB of disk storage for SCOOP activities and had plans to expand; LSU had undertaken agreements with the San Diego Supercomputer Center (SDSC) to use up to 7 TB of SDSC Storage Resource Broker resources to meet SCOOP requirements. LSU also began investigating Grid-toolkit methods for archiving and distributed storage.
SCOOP partners had gained significant experience in the use of various catalog and data services, including THREDDS, OPeNDAP, and Live Access Server, to facilitate data discovery, subsetting, and access/transport. They also employed web-scraping and File Transfer Protocol (FTP) methods to acquire data for model initialization, data assimilation, and model verification. Beginning in 2004, partners with experience in the Unidata Internet Data Distribution (IDD) system and the underlying Local Data Manager (LDM) software promoted the use of LDM as a transport for datasets between partners. An ad hoc point-to-point IDD was established, and model outputs as well as initialization data were transported to the partners using this system. A dedicated SCOOP Catalog service was also designed and implemented. This presentation will identify the successes and failures encountered and the process used to improve performance in data management and transport.
The NOAA Coastal Services Center and the Office of Naval Research provide funding for the SURA/SCOOP Program. The following institutions are collaborators within the SCOOP Program: Bedford Institute of Marine Science, the Gulf of Maine Ocean Observing System, Louisiana State University Center for Computation and Technology, MCNC, Renaissance Computing Institute, Texas A&M, the University of Alabama in Huntsville, the University of Florida, the University of Miami Center for Southeastern Tropical Advanced Remote Sensing, the University of North Carolina, and the Virginia Institute of Marine Science.
Session 9A, Internet Applications and Cyberinfrastructure
Thursday, 18 January 2007, 1:15 PM-3:00 PM, 216AB