Colorado State University’s Cooperative Institute for Research in the Atmosphere (CIRA) has recently developed and implemented a standardized real-time task queue automation system and framework in Python. The system creates, executes, and manages satellite and atmospheric data processing tasks using the producer/consumer, strategy, façade, and command design patterns. Python’s reflection capability is used to import task plug-in modules at run time from command-line-specified JavaScript Object Notation (JSON) configuration files and to execute the producer and consumer methods within those modules. Task producer methods compile information (parameters, metadata, data files, etc.) into task data structures, while task consumer methods perform a task using the given task data structure. Consumer methods are façades for any type of algorithm: downloading, converting, and formatting data; running a data fusion program; running a model; generating imagery; and so on. Following the strategy design pattern, all task plug-in modules’ producer and consumer methods implement the same interfaces, ensuring a consistent behavioral contract between the task queue program and the task plug-in methods. A command method is available to producer and consumer methods for invoking externally called algorithms or models.

Multiple instances of the system run simultaneously with different task configurations on multi-processor servers, which has increased the efficiency of data processing. A variant of the system that uses Python’s multiprocessing module for parallel task execution has also been developed and implemented. The task queue automation system was first deployed in September 2015 and has since been successfully implemented for the tasks of eight projects.
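The reflection-based plug-in loading described above might be sketched as follows. This is a minimal illustration, not CIRA's actual interface: the module name `demo_plugin`, the function names `produce_tasks` and `consume_task`, and the configuration keys are all hypothetical stand-ins.

```python
import importlib
import json
import sys
import tempfile
from pathlib import Path

# Hypothetical plug-in module: a producer that builds task data
# structures and a consumer that acts on one of them (names are
# illustrative, not CIRA's actual interface).
PLUGIN_SOURCE = '''
def produce_tasks(params):
    """Compile parameters into task data structures."""
    return [{"name": params["name"], "step": i} for i in range(params["count"])]

def consume_task(task):
    """Perform the task described by the task data structure."""
    return f"processed {task['name']} step {task['step']}"
'''

# Stand-in for a command-line-specified JSON configuration file.
CONFIG_TEXT = '{"module": "demo_plugin", "params": {"name": "goes_imagery", "count": 3}}'

# Write the plug-in to a temporary directory so it is importable.
plugin_dir = Path(tempfile.mkdtemp())
(plugin_dir / "demo_plugin.py").write_text(PLUGIN_SOURCE)
sys.path.insert(0, str(plugin_dir))

config = json.loads(CONFIG_TEXT)

# Reflection: import the plug-in module named in the config at run
# time, then look up its producer and consumer methods by name.
module = importlib.import_module(config["module"])
producer = getattr(module, "produce_tasks")
consumer = getattr(module, "consume_task")

results = [consumer(task) for task in producer(config["params"])]
print(results)
```

Because the queue program only ever touches the plug-in through the configured module name and the two method names, new task types can be added without modifying the queue code itself.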
This task plug-in producer/consumer framework has also significantly reduced the development and testing time needed to automate real-time derived and fused datasets, products, and imagery.
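The behavioral contract that every plug-in implements could be expressed with an abstract base class, as in the sketch below. The class and method names (`TaskPlugin`, `produce`, `consume`) are assumptions for illustration; the point is that the queue driver depends only on the shared interface, which is what makes new plug-ins cheap to develop and test.

```python
from abc import ABC, abstractmethod

class TaskPlugin(ABC):
    """Strategy-pattern contract every task plug-in implements
    (illustrative; the actual CIRA interface names may differ)."""

    @abstractmethod
    def produce(self, params):
        """Return an iterable of task data structures."""

    @abstractmethod
    def consume(self, task):
        """Perform one task; a facade over any underlying algorithm."""

class DownloadPlugin(TaskPlugin):
    """Example plug-in whose consumer fronts a download step."""

    def produce(self, params):
        return [{"url": u} for u in params["urls"]]

    def consume(self, task):
        # A real consumer might invoke an external program via the
        # shared command method; here we just report success.
        return ("downloaded", task["url"])

def run(plugin, params):
    """Queue driver: works with any plug-in honoring the contract."""
    return [plugin.consume(t) for t in plugin.produce(params)]

out = run(DownloadPlugin(), {"urls": ["a", "b"]})
print(out)
```

Swapping in a different `TaskPlugin` subclass changes what the queue does without changing the driver, which is the strategy pattern's consistent behavioral contract in miniature.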
Short-term improvements include refactoring the system to run continuously as a daemon and deploying it within Docker containers and other virtual machine technologies. Long-term plans include researching and refactoring the system to incorporate distributed task management tools such as Celery, so that tasks can be queued and executed on remote servers and virtual machines. Encapsulating the system in virtual machines and parallelizing task execution through distributed processing will allow processing servers to be used more efficiently, increasing the speed at which satellite and atmospheric data can be processed.
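Parallel task execution can be sketched as a producer filling a pool of workers with task data structures. The production variant described above uses Python's multiprocessing module; the sketch below substitutes a thread pool from `concurrent.futures` only to keep the example compact and portable, and the producer/consumer functions are hypothetical stand-ins.

```python
from concurrent.futures import ThreadPoolExecutor

def produce(params):
    """Hypothetical producer: build one task per scene to process."""
    return [{"scene": i} for i in range(params["scenes"])]

def consume(task):
    """Hypothetical consumer: stand-in for real data processing."""
    return task["scene"] * task["scene"]

tasks = produce({"scenes": 5})

# Execute the consumer over all tasks in parallel; map() preserves
# the order of the input tasks in its results.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(consume, tasks))
print(results)
```

A distributed variant would replace the local pool with a broker-backed queue (the role Celery fills), so workers on remote servers or virtual machines could pull tasks from the same queue.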