310635 Standardized Real-Time Satellite and Atmospheric Data Processing within a Task Queue Automation System and Framework Utilizing Software Design Patterns, Reflection, and Plugin Architecture for Derived and Fused Dataset and Imagery Generation

Tuesday, 24 January 2017
4E (Washington State Convention Center)
Scott Longmore, CIRA/Colorado State Univ., Fort Collins, CO; and S. D. Miller, C. J. Slocum, P. T. Partain, and J. Fluke

As the spatial and temporal resolution of satellite sensors increases, along with the number of instruments, channels, and datasets from upcoming platforms (e.g., GOES-R), so does the need to generate derived and fused datasets and imagery quickly and efficiently in real time. Designing and implementing generalized, configurable, and scalable software that draws on advances in software design methodologies is one approach to accommodating these increases.

Colorado State University’s Cooperative Institute for Research in the Atmosphere (CIRA) has recently developed and implemented a standardized real-time task queue automation system and framework in the Python language. The system creates, executes, and manages satellite and atmospheric data processing tasks using the producer/consumer, strategy, façade, and command design patterns. Python’s reflection capability is used to import task plug-in modules at run-time, as named in JavaScript Object Notation (JSON) configuration files specified on the command line, and to execute the task producer and consumer methods within those modules. Task producer methods compile information (parameters, metadata, data files, etc.) into task data structures, while task consumer methods perform a task with the given task data structure. Task consumer methods are façades for any type of algorithm, such as downloading, converting, and formatting data, running a data fusion program, running a model, or generating imagery. Following the strategy design pattern, all task plug-in modules’ producer and consumer methods implement the same interfaces, ensuring a consistent behavioral contract between the task queue program and the task plug-in methods. A command method is available to task consumer and producer methods for externally called algorithms or models. Multiple instances of the system are run simultaneously with different task configurations on multi-processor servers, which has increased the efficiency of data processing. A variant of the system that uses Python’s multiprocessing modules for parallel task execution has also been developed and implemented. The task queue automation system was first implemented in September 2015 and has since been implemented for the tasks of eight projects. Using this task plug-in producer and consumer framework has also significantly reduced the development and testing time for automating real-time derived and fused datasets, products, and imagery.
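
The plug-in mechanism described above can be illustrated with a minimal sketch. This is not the CIRA implementation: the module name `example_task_plugin`, the function names `produce` and `consume`, and the configuration keys are all hypothetical, chosen only to show how a JSON configuration, run-time import, and a shared producer/consumer interface might fit together. The plug-in is registered in memory so that the example is self-contained; in practice it would presumably be a separate module file on the import path.

```python
import importlib
import json
import queue
import sys
import types

# Hypothetical plug-in module, built in memory so the sketch runs on its own.
plugin = types.ModuleType("example_task_plugin")


def produce(parameters):
    """Producer: compile parameters and file names into task data structures."""
    return [{"file": name, "scale": parameters["scale"]}
            for name in parameters["files"]]


def consume(task):
    """Consumer: a facade over whatever algorithm performs the work."""
    return f"{task['file']} processed at scale {task['scale']}"


plugin.produce = produce
plugin.consume = consume
sys.modules[plugin.__name__] = plugin


def run(config_text):
    """Load the plug-in named in the JSON config and run its tasks."""
    config = json.loads(config_text)
    # Reflection: resolve the module and its methods from strings at run-time.
    module = importlib.import_module(config["plugin"])
    producer = getattr(module, config["producer"])
    consumer = getattr(module, config["consumer"])

    # Producer side: fill the queue with task data structures.
    tasks = queue.Queue()
    for task in producer(config["parameters"]):
        tasks.put(task)

    # Consumer side: execute each queued task through the same interface.
    results = []
    while not tasks.empty():
        results.append(consumer(tasks.get()))
    return results


# Hypothetical configuration; the keys are assumptions for illustration only.
CONFIG = json.dumps({
    "plugin": "example_task_plugin",
    "producer": "produce",
    "consumer": "consume",
    "parameters": {"files": ["a.nc", "b.nc"], "scale": 2},
})

results = run(CONFIG)
```

Because the queue program only ever calls a producer that returns task structures and a consumer that accepts one, swapping in a different plug-in requires changing the configuration rather than the queue code, which is the behavioral contract the strategy pattern provides.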

Short-term improvements include refactoring the system to run continuously as a daemon and deploying it within Docker containers and virtual machine technologies. Long-term plans include researching and refactoring the system to incorporate distributed task management tools such as Celery, so that tasks can be queued and executed on remote servers and virtual machines. Encapsulating the system in containers and virtual machines and parallelizing task execution through distributed processing will allow processing servers to be used more efficiently, increasing the speed at which satellite and atmospheric data can be processed.
