An architecture for the LEAD data repository
LEAD presents unique challenges in integrating large volumes of data, both data arriving from real-time observational systems and data created dynamically during the execution of adaptive workflows. The LEAD Data Repository, which manages these data, must autonomously handle storage and retrieval requests generated by the LEAD orchestration in addition to directly satisfying user requests.
This paper will outline an architecture being developed at the Unidata Program Center (UPC) to support the data storage requirements of the LEAD Data Subsystem. This architecture defines a set of simple interfaces to handle the responsibilities of a potentially complex data repository. It will support capabilities such as acquiring storage resources, moving data, generating metadata and unique IDs, cataloging, managing the data, providing a discovery mechanism (via browsing or queries), and providing transparent data access by LEAD services, users, and other applications.
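As an illustration of what a set of simple interfaces might look like, the following is a minimal Python sketch, not the UPC design. Every name in it (IdGenerator, Storage, Catalog, DataRepository, the mem:// locations, the size_bytes metadata key) is a hypothetical stand-in, and the in-memory classes only substitute for real storage and cataloging services. It covers ID generation, storage, cataloging with generated metadata, discovery by query, and access by ID.

```python
from __future__ import annotations

import uuid
from typing import Dict, List, Protocol, Tuple


# Hypothetical interfaces: one small contract per repository responsibility.
class IdGenerator(Protocol):
    def new_id(self) -> str: ...


class Storage(Protocol):
    def put(self, data_id: str, payload: bytes) -> str: ...
    def get(self, location: str) -> bytes: ...


class Catalog(Protocol):
    def register(self, data_id: str, location: str, metadata: Dict[str, str]) -> None: ...
    def locate(self, data_id: str) -> str: ...
    def query(self, **criteria: str) -> List[str]: ...


# Toy implementations, assumed here only to make the sketch runnable.
class UuidGenerator:
    def new_id(self) -> str:
        return uuid.uuid4().hex


class InMemoryStorage:
    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, data_id: str, payload: bytes) -> str:
        location = f"mem://{data_id}"  # made-up location scheme
        self._blobs[location] = payload
        return location

    def get(self, location: str) -> bytes:
        return self._blobs[location]


class InMemoryCatalog:
    def __init__(self) -> None:
        self._entries: Dict[str, Tuple[str, Dict[str, str]]] = {}

    def register(self, data_id: str, location: str, metadata: Dict[str, str]) -> None:
        self._entries[data_id] = (location, dict(metadata))

    def locate(self, data_id: str) -> str:
        return self._entries[data_id][0]

    def query(self, **criteria: str) -> List[str]:
        # Return the IDs whose metadata matches every criterion exactly.
        return [
            data_id
            for data_id, (_, meta) in self._entries.items()
            if all(meta.get(key) == value for key, value in criteria.items())
        ]


class DataRepository:
    """Facade that composes the three interfaces."""

    def __init__(self, ids: IdGenerator, storage: Storage, catalog: Catalog) -> None:
        self._ids = ids
        self._storage = storage
        self._catalog = catalog

    def store(self, payload: bytes, metadata: Dict[str, str]) -> str:
        data_id = self._ids.new_id()
        location = self._storage.put(data_id, payload)
        # Generated metadata is merged with what the caller supplied.
        full_metadata = dict(metadata, size_bytes=str(len(payload)))
        self._catalog.register(data_id, location, full_metadata)
        return data_id

    def retrieve(self, data_id: str) -> bytes:
        # Callers use only the ID; the storage location stays hidden.
        return self._storage.get(self._catalog.locate(data_id))

    def find(self, **criteria: str) -> List[str]:
        return self._catalog.query(**criteria)
```

Because the facade depends only on the interfaces, a workflow engine or a user-facing tool could call store, find, and retrieve without knowing which storage or catalog implementation sits behind them.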
A prototype implementation, based on UPC technologies such as THREDDS catalogs and the Common Data Model, is being developed and integrated with other LEAD technologies. The architecture is designed to let developers implement each interface with the most practical solution available rather than adopt a single large turnkey system. Various implementations will be produced by the LEAD project and made available as pluggable modules to benefit the community.
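One way such pluggable modules could be wired together is a small registry keyed by interface and implementation name. The Python sketch below is an assumption for illustration only: the register and create helpers, the "id_generator" interface name, and both generator classes are hypothetical and are not taken from the LEAD or THREDDS code.

```python
import uuid
from typing import Callable, Dict

# Hypothetical registry: interface name -> implementation name -> factory.
_REGISTRY: Dict[str, Dict[str, Callable[[], object]]] = {}


def register(interface: str, name: str):
    """Class decorator that records an implementation under an interface."""
    def decorator(factory):
        _REGISTRY.setdefault(interface, {})[name] = factory
        return factory
    return decorator


def create(interface: str, name: str):
    """Instantiate the implementation selected by name, e.g. from a config file."""
    try:
        factory = _REGISTRY[interface][name]
    except KeyError:
        raise LookupError(f"no '{name}' module registered for '{interface}'") from None
    return factory()


@register("id_generator", "uuid")
class UuidGenerator:
    def new_id(self) -> str:
        return uuid.uuid4().hex


@register("id_generator", "counter")
class CounterGenerator:
    def __init__(self) -> None:
        self._count = 0

    def new_id(self) -> str:
        self._count += 1
        return f"data-{self._count:06d}"  # made-up ID format


# A deployment chooses the module by name; calling code is unchanged.
generator = create("id_generator", "counter")
```

Swapping "counter" for "uuid" changes the ID scheme without touching any code that calls new_id, which is the property the pluggable design aims for.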
Supplementary URL: http://www.unidata.ucar.edu/projects/LEAD/ThreddsDataRepository.html