A Lightweight, Scalable Framework for Remote Sensing Algorithm Design, Development, and Transition to Operations

Wednesday, 7 January 2015
Phoenix Convention Center - West and North Buildings
Alexander Werbos, AER, Lexington, MA; and D. B. Hogan, D. Hunt, E. Steinfelt, and T. S. Zaccheo

In this work, we present a set of software interfaces and tools that allow users to create, document, execute, and update data processing systems. The software is specifically designed to reduce the overhead involved in traditional research-to-operations activities and to enable direct sharing of science instrument processing algorithms across testing, development, and production environments. The interfaces provided allow algorithms implemented in a variety of common programming languages to be adapted for use in the system. This applies both to single-purpose algorithms seeking to leverage the power of the tools and to fully generalized algorithms designed for use across different instruments and platforms. With these tools, users will be able to create multi-algorithm processing systems with complex precedence chains, audit the system for completeness, and execute the specified algorithms using either local or remote computational resources.
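The precedence and completeness ideas above can be illustrated with a minimal Python sketch. It is not the system's actual implementation: the algorithm names, the product names, and the helpers `audit` and `execution_order` are all hypothetical, and the descriptors are plain dictionaries standing in for whatever the real configuration holds.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical descriptors: each algorithm lists the products it reads and writes.
ALGORITHMS = {
    "cloud_mask": {
        "inputs": ["l1_radiance"],
        "outputs": ["cloud_flags"],
    },
    "sounding": {
        "inputs": ["l1_radiance", "cloud_flags", "nwp_forecast"],
        "outputs": ["temperature_profile"],
    },
    "quality_report": {
        "inputs": ["temperature_profile"],
        "outputs": ["qc_summary"],
    },
}

# Products assumed to be supplied from outside the processing system.
EXTERNAL_INPUTS = {"l1_radiance", "nwp_forecast"}


def audit(algorithms, external_inputs):
    """Return {algorithm: [missing inputs]} for inputs that nothing provides."""
    available = set(external_inputs)
    for spec in algorithms.values():
        available.update(spec["outputs"])
    missing = {}
    for name, spec in algorithms.items():
        gaps = sorted(set(spec["inputs"]) - available)
        if gaps:
            missing[name] = gaps
    return missing


def execution_order(algorithms):
    """Order algorithms so that every producer runs before its consumers."""
    producer = {
        out: name for name, spec in algorithms.items() for out in spec["outputs"]
    }
    # Map each algorithm to the set of algorithms it depends on.
    graph = {
        name: {producer[i] for i in spec["inputs"] if i in producer}
        for name, spec in algorithms.items()
    }
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(audit(ALGORITHMS, EXTERNAL_INPUTS))
    print(execution_order(ALGORITHMS))
```

Deriving the order from declared inputs and outputs, rather than listing it by hand, is what lets an audit report a missing ancillary input before anything is executed.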

The software system comprises four core components: the Algorithm Descriptor Database (ADDB), the Data Model Interface (DMI), the algorithm architect tool, and the algorithm driver. The ADDB stores programmatically accessible data describing all algorithm inputs and outputs (Level 1, Level 2, ancillary, and auxiliary data), data type characteristics, performance characteristics (CPU, memory, and storage requirements), and the configuration of individual processing systems. The tools within the system use this information to create, store, and analyze algorithm configurations, automating many functions that are typically performed manually. The DMI is the standard I/O interface that algorithms use to read and write all algorithm data; it allows algorithms to be moved, without change, between any processing environments that implement the interface. Using information from the ADDB, the DMI automatically maps between the data types used by the algorithm and those present in the processing environment, decoupling the algorithm implementation from operational concerns such as storage precision and spatial and temporal resolution. The algorithm architect tool is a graphical application that lets users visualize the system as they add or configure algorithms, and it incorporates libraries for automatically generating precedence trees. The algorithm driver is a parallel-processing system that reads an ADDB system configuration, automatically partitions the input files for distribution across processing units, and executes the specified algorithm chains to produce the desired outputs.
A common use case is the iterative development of a science algorithm in a flexible, small-scale local computing environment, followed by its transition to a large-scale computing environment at another facility or in the cloud for testing and production with larger data volumes.
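The decoupling role described for the DMI can be sketched in Python as follows. This is an illustration under stated assumptions only, not the DMI's real API: the class names `DataModelInterface`, `InMemoryDMI`, and `ScaledIntegerDMI`, the `read`/`write` methods, the variable names, and the scaled-integer storage scheme are all hypothetical.

```python
from abc import ABC, abstractmethod


class DataModelInterface(ABC):
    """Hypothetical stand-in for a standard I/O interface used by algorithms."""

    @abstractmethod
    def read(self, name):
        """Return the named variable as a list of floats."""

    @abstractmethod
    def write(self, name, values):
        """Store a list of floats under the given name."""


class InMemoryDMI(DataModelInterface):
    """Development backend: keeps values in memory at full precision."""

    def __init__(self, data):
        self._data = {name: list(values) for name, values in data.items()}

    def read(self, name):
        return list(self._data[name])

    def write(self, name, values):
        self._data[name] = list(values)


class ScaledIntegerDMI(DataModelInterface):
    """Assumed operational backend: stores values as scaled integers."""

    def __init__(self, scale_factors, data=None):
        self._scale = dict(scale_factors)  # per-variable storage precision
        self._store = {}
        for name, values in (data or {}).items():
            self.write(name, values)

    def read(self, name):
        scale = self._scale[name]
        return [raw * scale for raw in self._store[name]]

    def write(self, name, values):
        scale = self._scale[name]
        self._store[name] = [round(v / scale) for v in values]


def temperature_anomaly(dmi):
    """Toy algorithm: it only sees the interface, never the storage format."""
    temps = dmi.read("brightness_temperature")
    mean = sum(temps) / len(temps)
    dmi.write("anomaly", [t - mean for t in temps])


if __name__ == "__main__":
    sample = {"brightness_temperature": [250.0, 260.0, 270.0]}
    local = InMemoryDMI(sample)
    operational = ScaledIntegerDMI(
        {"brightness_temperature": 0.01, "anomaly": 0.01}, sample
    )
    for env in (local, operational):
        temperature_anomaly(env)  # same algorithm code in both environments
        print(env.read("anomaly"))
```

Because `temperature_anomaly` depends only on the abstract interface, the same function runs unchanged against either backend, and the storage precision is a property of the environment rather than of the algorithm.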

The core components of the system are designed to be interoperable with a variety of off-the-shelf tools for data analysis and distribution. The algorithm driver is capable of generating outputs in a variety of self-documenting formats, including NetCDF and HDF, and can incorporate CF-compatible metadata. By combining the new functionality with tools commonly in use by the remote sensing community, users are able to conveniently assemble, analyze, and maintain complete data processing and distribution systems with only minimal overhead.