J12.4 The Algorithm Workbench: A Scalable Systems Toolkit Using Shared Algorithm Components

Thursday, 14 January 2016: 2:15 PM
Room 252/254 (New Orleans Ernest N. Morial Convention Center)
Alexander Werbos, AER, Lexington, MA; and D. B. Hogan, D. Hunt, E. Steinfelt, and T. S. Zaccheo

In this work, we present the Algorithm Workbench: a software system that allows users to develop, analyze, and execute data processing algorithms using multiple languages, frameworks, and data sources. Building on the advanced data model concepts developed in the GOES-R program, the Algorithm Workbench (AWB) is capable of unifying algorithms from multiple sources into a live processing system, formatting their data into a common structure, and producing metadata-enhanced outputs that can be used and visualized in a variety of off-the-shelf or custom tools. The Algorithm Workbench provides an end-to-end solution for the pre-processing of inputs, generation of data, and output formatting, but it is built on a set of open interfaces that allow users to easily replace or supplement any software element to ensure that it can be integrated into their own system. In this approach, the algorithms themselves are fully encapsulated components that can be automatically scheduled and executed based on their associated metadata. This flexibility, coupled with the included tools for system engineering and management, makes the Algorithm Workbench capable of supporting user needs across research, operations, and the transition between the two. In this presentation, we will describe the architecture of the Algorithm Workbench and how it facilitates the development of user-owned, component-based systems supporting both small-scale processing and enterprise ground architectures.

The Algorithm Workbench provides a common execution interface for algorithms, and any software component meeting this interface can be seamlessly integrated with all other AWB-compatible algorithms. The provided tools and patterns have already been used to adapt algorithms from the GOES-R baseline, GOES-R Option 2, and the Community Satellite Processing Package (CSPP) to this interface. With the same tools, users can easily add their own algorithms, which can immediately draw on the outputs of the existing ones.
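
To make the idea of a common execution interface concrete, the following is a minimal sketch and not the actual AWB interface. It assumes that each algorithm declares its inputs and outputs as metadata and that this metadata is enough to derive an execution order. The names `Algorithm`, `schedule`, and `execute`, and the two toy algorithms, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class Algorithm:
    """Hypothetical wrapper: a callable plus the metadata needed to schedule it."""
    name: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]
    run: Callable[[Dict[str, object]], Dict[str, object]]


def schedule(algorithms: Iterable[Algorithm], available: Iterable[str]) -> List[Algorithm]:
    """Order algorithms so that each one runs only after its inputs exist."""
    ready = set(available)
    pending = list(algorithms)
    ordered: List[Algorithm] = []
    while pending:
        runnable = [a for a in pending if set(a.inputs) <= ready]
        if not runnable:
            missing = {a.name: sorted(set(a.inputs) - ready) for a in pending}
            raise ValueError(f"unsatisfied inputs: {missing}")
        for alg in runnable:
            ordered.append(alg)
            ready.update(alg.outputs)
            pending.remove(alg)
    return ordered


def execute(algorithms: Iterable[Algorithm], data: Dict[str, object]) -> Dict[str, object]:
    """Run every algorithm in dependency order against a shared data store."""
    store = dict(data)
    for alg in schedule(algorithms, store.keys()):
        store.update(alg.run({key: store[key] for key in alg.inputs}))
    return store


# Toy algorithms (illustrative only): the second consumes the first one's output.
cloud_mask = Algorithm(
    "cloud_mask", ("radiance",), ("mask",),
    lambda d: {"mask": [r > 0.5 for r in d["radiance"]]},
)
cloud_fraction = Algorithm(
    "cloud_fraction", ("mask",), ("fraction",),
    lambda d: {"fraction": sum(d["mask"]) / len(d["mask"])},
)

# Listing order does not matter; the declared metadata determines run order.
result = execute([cloud_fraction, cloud_mask], {"radiance": [0.2, 0.7, 0.9, 0.4]})
```

In this sketch, a newly added algorithm only has to name the outputs it consumes in order to be placed correctly after the algorithms that produce them.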

The Algorithm Workbench is designed for scalability to meet diverse user needs. Each algorithm is parameterized in its execution region to allow the infrastructure configuration to completely dictate processing blocks, number of parallel processes, cadence, and coverage area. By adjusting these parameters, or using the included tools that automatically determine efficient values, users can scale the data processing from laptops to high-performance servers, and even multiple cloud-processing nodes.
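
As a rough illustration of parameterizing an algorithm by its execution region, the sketch below splits a coverage area into processing blocks and runs them on a configurable number of workers. It is an assumption-laden example, not AWB code: `Region`, `split_region`, `process_block`, and `run_region` are hypothetical names, and the per-block "algorithm" is a stand-in that only counts pixels.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Region:
    """Hypothetical execution region: half-open row and column ranges."""
    row_start: int
    row_stop: int
    col_start: int
    col_stop: int


def split_region(region: Region, block_rows: int, block_cols: int) -> List[Region]:
    """Tile a coverage area into processing blocks of at most the given size."""
    blocks = []
    for r in range(region.row_start, region.row_stop, block_rows):
        for c in range(region.col_start, region.col_stop, block_cols):
            blocks.append(Region(
                r, min(r + block_rows, region.row_stop),
                c, min(c + block_cols, region.col_stop),
            ))
    return blocks


def process_block(block: Region) -> int:
    """Stand-in for an algorithm: returns the number of pixels in the block."""
    return (block.row_stop - block.row_start) * (block.col_stop - block.col_start)


def run_region(region: Region, block_rows: int, block_cols: int, workers: int) -> List[int]:
    """Process every block; block size and worker count come from configuration."""
    blocks = split_region(region, block_rows, block_cols)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_block, blocks))


full_disk = Region(0, 100, 0, 250)

# Same algorithm, two infrastructure configurations.
laptop_counts = run_region(full_disk, block_rows=100, block_cols=250, workers=1)
server_counts = run_region(full_disk, block_rows=40, block_cols=100, workers=4)
```

Because the algorithm only ever sees the block it is handed, changing the block size or worker count here requires no change to `process_block`, which is the property the paragraph above describes.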

The configurations used by the Algorithm Workbench are stored in a component known as the Algorithm Descriptor Database (ADDB). The ADDB employs a three-layered architecture to store metadata about algorithms, data, and system configurations in a programmatically accessible fashion. These layers are Science, Configuration, and Instance. The Science layer is the lowest layer and contains immutable information about algorithm and data properties, specified in abstract terms such as required inputs and generated outputs. The Configuration layer is built atop the Science layer and describes assemblies of particular algorithms, data importers, and output formatters into a processing system. The Instance layer binds the specific data and resources listed in the Configuration layer to the file and host names that allow the system to run in a particular processing environment. Since each layer of the Algorithm Workbench configuration is independent of the ones above it, Configurations can be developed on individual workstations and migrated directly to more powerful systems for live execution. This enables a very tight iterative loop in system development, where new systems and updates to existing ones can be built and tested in a simplified development environment and then placed without change into full production environments.
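
The three layers described above can be sketched as plain data structures. This is an illustrative model only and does not reflect the actual ADDB schema; the class names, fields, host names, and file paths are all hypothetical. It shows the key property that one Configuration can be bound to several environments by swapping only the Instance layer.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class ScienceEntry:
    """Science layer: immutable, abstract description of an algorithm."""
    algorithm: str
    inputs: Tuple[str, ...]
    outputs: Tuple[str, ...]


@dataclass(frozen=True)
class Configuration:
    """Configuration layer: an assembly of Science entries into a system."""
    name: str
    algorithms: Tuple[ScienceEntry, ...]

    def external_inputs(self) -> Set[str]:
        """Data the assembly needs but does not produce itself."""
        produced = {o for a in self.algorithms for o in a.outputs}
        needed = {i for a in self.algorithms for i in a.inputs}
        return needed - produced


@dataclass(frozen=True)
class Instance:
    """Instance layer: binds abstract data names to files on a given host."""
    configuration: Configuration
    host: str
    bindings: Dict[str, str]

    def unbound_inputs(self) -> Set[str]:
        """External inputs that still lack a concrete file binding."""
        return self.configuration.external_inputs() - set(self.bindings)


# Hypothetical Science entries and a Configuration assembled from them.
cloud_mask = ScienceEntry("cloud_mask", ("radiance",), ("mask",))
cloud_fraction = ScienceEntry("cloud_fraction", ("mask",), ("fraction",))
config = Configuration("demo_chain", (cloud_mask, cloud_fraction))

# The same Configuration bound to two environments; only the Instance differs.
laptop = Instance(config, "localhost", {"radiance": "/tmp/sample_radiance.nc"})
server = Instance(config, "proc-node-01", {"radiance": "/data/live/radiance.nc"})
incomplete = Instance(config, "proc-node-02", {})
```

A check such as `unbound_inputs()` is one way an Instance could be validated before execution, without touching the Configuration or Science entries beneath it.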

The Algorithm Workbench provides a complete solution for algorithm processing, and is currently capable of running algorithms from multiple operational teams on data from a variety of sources. Built from the ground up with a philosophy of open interfaces, it is designed to drive powerful user-owned processing systems, ranging from research-focused offline development, to near real-time enterprise ground system operations.
