A Lightweight, Scalable Framework for Remote Sensing Algorithm Design, Development, and Transition to Operations
This software system comprises four core components: the Algorithm Descriptor Database (ADDB), the Data Model Interface (DMI), the algorithm architect tool, and the algorithm driver. The ADDB stores programmatically accessible information about data type characteristics, the configuration of individual processing systems, all algorithm inputs and outputs (Level 1, Level 2, ancillary, and auxiliary data), and algorithm performance characteristics (CPU, memory, and storage requirements). The tools within the system incorporate this information to create, store, and analyze algorithm configurations, automating many functions that are typically performed manually. The DMI is the standard I/O interface that algorithms use to read and write all algorithm data; it allows algorithms to be moved, without change, between any processing environments that implement the interface. Using information from the ADDB, the DMI automatically maps between the data types used by the algorithm and those present in the processing environment, decoupling the algorithm implementation from operational concerns regarding storage precision and spatial and temporal resolution. The algorithm architect tool is a graphical application that lets users visualize the system as they add or configure algorithms, and it incorporates libraries that automatically generate precedence trees. The algorithm driver is a parallel processing system that reads an ADDB system configuration, automatically partitions the input files for distribution across processing units, and executes the specified algorithm chains to produce the desired outputs.
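As an illustration of how input/output descriptors can drive automation, the following minimal Python sketch derives an execution order for a small algorithm chain and splits a list of input files across processing units. It is not the system's actual implementation: the descriptor layout, the algorithm and product names, and the functions `precedence_order` and `partition` are all hypothetical, and a flat ordering computed with Kahn's algorithm stands in for the precedence trees produced by the architect tool.

```python
from collections import deque

# Hypothetical descriptor records; the field names are illustrative only
# and do not reflect the real ADDB schema.
ALGORITHMS = {
    "sst_gradient": {"inputs": ["sst"], "outputs": ["sst_gradient"]},
    "sst": {"inputs": ["L1_radiance", "cloud_mask", "nwp_ancillary"],
            "outputs": ["sst"]},
    "cloud_mask": {"inputs": ["L1_radiance"], "outputs": ["cloud_mask"]},
}


def precedence_order(algorithms):
    """Return algorithm names ordered so producers run before consumers."""
    # Map each product to the algorithm that produces it.
    producer = {out: name
                for name, desc in algorithms.items()
                for out in desc["outputs"]}
    # An algorithm depends on the producers of its inputs; inputs with no
    # producer (Level 1 or ancillary data) are treated as external.
    deps = {name: {producer[i] for i in desc["inputs"] if i in producer}
            for name, desc in algorithms.items()}
    dependents = {name: [] for name in algorithms}
    for name, needed in deps.items():
        for dep in needed:
            dependents[dep].append(name)

    # Kahn's algorithm: repeatedly schedule algorithms with no unmet deps.
    unmet = {name: len(needed) for name, needed in deps.items()}
    ready = deque(sorted(n for n, count in unmet.items() if count == 0))
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(dependents[current]):
            unmet[nxt] -= 1
            if unmet[nxt] == 0:
                ready.append(nxt)
    if len(order) != len(algorithms):
        raise ValueError("cyclic dependency among algorithms")
    return order


def partition(files, n_units):
    """Split input files round-robin across n_units processing units."""
    return [files[i::n_units] for i in range(n_units)]


if __name__ == "__main__":
    print(precedence_order(ALGORITHMS))
    print(partition(["g1.nc", "g2.nc", "g3.nc", "g4.nc", "g5.nc"], 2))
```

A real driver would presumably weigh the CPU, memory, and storage requirements recorded in the ADDB when partitioning; the round-robin split here is only the simplest possible placeholder.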
A common use case is the iterative development of a science algorithm in a flexible, small-scale local computing environment, followed by the transition of the algorithm to a large-scale computing environment at another facility or in the cloud for testing and production with larger data volumes.
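The portability behind this use case can be sketched in a few lines of Python. The example below is an assumption-laden illustration rather than the DMI itself: the class names, the `read`/`write` methods, the variable names, and the scale factors are all hypothetical. It shows one toy algorithm that touches data only through an abstract interface, run unchanged against an in-memory "local" environment and against an environment that stores values as scaled integers.

```python
from abc import ABC, abstractmethod


class DataModelInterface(ABC):
    """Hypothetical stand-in for the DMI: the only I/O an algorithm sees."""

    @abstractmethod
    def read(self, name):
        """Return the named variable as a list of floats."""

    @abstractmethod
    def write(self, name, values):
        """Store a list of floats under the given name."""


class LocalEnvironment(DataModelInterface):
    """Development environment: values are kept as floats in memory."""

    def __init__(self, data):
        self._data = {k: list(v) for k, v in data.items()}

    def read(self, name):
        return list(self._data[name])

    def write(self, name, values):
        self._data[name] = list(values)


class PackedEnvironment(DataModelInterface):
    """Operational-style environment: values are stored as scaled integers.

    The per-variable scale factors are hypothetical and play the role of
    type information that would otherwise come from a descriptor database.
    """

    def __init__(self, packed, scale):
        self._packed = {k: list(v) for k, v in packed.items()}
        self._scale = dict(scale)

    def read(self, name):
        s = self._scale[name]
        return [p * s for p in self._packed[name]]

    def write(self, name, values):
        s = self._scale[name]
        self._packed[name] = [round(v / s) for v in values]


def brightness_difference(dmi):
    """Toy algorithm: all data access goes through the interface."""
    a = dmi.read("bt_11um")
    b = dmi.read("bt_12um")
    dmi.write("bt_diff", [x - y for x, y in zip(a, b)])


if __name__ == "__main__":
    local = LocalEnvironment({"bt_11um": [290.0, 285.5],
                              "bt_12um": [288.0, 284.0]})
    packed = PackedEnvironment(
        {"bt_11um": [580, 571], "bt_12um": [576, 568]},
        {"bt_11um": 0.5, "bt_12um": 0.5, "bt_diff": 0.5},
    )
    for env in (local, packed):
        brightness_difference(env)  # same algorithm code in both cases
        print(env.read("bt_diff"))
```

Because the algorithm never sees how values are stored, differences in storage precision stay inside the environment classes, which is the kind of decoupling the abstract attributes to the DMI.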
The core components of the system are designed to be interoperable with a variety of off-the-shelf tools for data analysis and distribution. The algorithm driver can generate outputs in a variety of self-documenting formats, including NetCDF and HDF, and can incorporate CF-compatible metadata. By combining the new functionality with tools commonly used by the remote sensing community, users can conveniently assemble, analyze, and maintain complete data processing and distribution systems with minimal overhead.