Employing Task Parallelism to Facilitate Dynamic Comparison of Model Output

Aizenman, Hannah

The analysis of many climate models, such as the Community Earth System Model (CESM), requires extensive and expensive computing and storage resources. Since these resources are often out of reach of many young scientists and small research groups, the goal of this project, carried out through the Summer Internships in Parallel Computational Science (SIParCS) program of the National Center for Atmospheric Research's (NCAR) Computational Information Systems Laboratory (CISL), was to make a commonly used climate model diagnostic tool available through the web so that users could make use of NCAR resources in a simple and public way. The user-facing interface is written in HTML/JavaScript and is built on top of a RESTful API implemented using the Pyramid web framework. An advantage of Pyramid is its lightweight plug-and-play architecture, which makes it easier to maintain, customize, and extend the tool as the needs of the user and application grow. The gevent library was used to add asynchronous task execution so that each execution of the diagnostics is independent. This will enable web-based real-time monitoring of the status of a diagnostic run. The results of a run are treated as a RESTful resource so that they can be obtained in different forms, for example as a compressed archive or a file list, using the same URL, and so that a database or caching scheme can be integrated later. The UI is separated from the server-side API so that developers can add new interfaces without breaking working ones.
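The pattern described above (submit a run, poll its status, then fetch one result resource in more than one representation) can be sketched as follows. This is a minimal illustration, not the project's code: it uses only the Python standard library, with a thread pool standing in for gevent and plain functions standing in for Pyramid views. All names (`submit`, `status`, `results`, `run_diagnostics`) and the in-memory job store are assumptions made for the example.

```python
import io
import tarfile
import uuid
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory job store; a real deployment might use a database.
_executor = ThreadPoolExecutor(max_workers=4)
_jobs = {}


def run_diagnostics(settings):
    """Placeholder for a diagnostic run; returns {filename: bytes}."""
    return {"summary.txt": ("case=%s\n" % settings["case"]).encode()}


def submit(settings):
    """Start a run in the background and return its job id immediately."""
    job_id = uuid.uuid4().hex
    _jobs[job_id] = _executor.submit(run_diagnostics, settings)
    return job_id


def status(job_id):
    """Report the state of a run without blocking on it."""
    future = _jobs[job_id]
    if not future.done():
        return "running"
    return "failed" if future.exception() else "finished"


def results(job_id, representation="list"):
    """Return one result resource as a file list or a gzipped tar archive."""
    files = _jobs[job_id].result()  # blocks until the run has completed
    if representation == "list":
        return sorted(files)
    if representation == "archive":
        buffer = io.BytesIO()
        with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
            for name, data in files.items():
                info = tarfile.TarInfo(name)
                info.size = len(data)
                archive.addfile(info, io.BytesIO(data))
        return buffer.getvalue()
    raise ValueError("unknown representation: %s" % representation)
```

Because each run is keyed by its own job id, a web handler could call `status(job_id)` repeatedly to drive a progress display, and the choice between list and archive stays a property of the request rather than of the run.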

A major aspect of building the web interface was rewriting the CESM Ocean Model Working Group's (OMWG) model diagnostics C shell driver scripts as a Python library. This was done mainly to simplify the process of running the diagnostics with user-defined settings, but also to improve their usability, maintainability, robustness, and extensibility. Python was chosen in large part because it has extensive support for managing calls to shell utilities, which was crucial because the diagnostics depend heavily on numerous shell and NCL scripts that would take an extensive amount of time to convert to Python. The library wraps the Swift parallel scripting language version of the diagnostics so that the NCL scripts at the core of the diagnostics can be run in parallel instead of serially. Functions for creating Swift configurations were also built into the library to make working with Swift easier. The web version takes full advantage of this by running an independently configured job on every call and returning the results as an archive, a styled folder, or a plain folder, depending on the URL. This flexibility was a major design goal, because it makes it easier for users to fit the tool into their workflow, which is key to any widespread adoption. We created a maintainable and thoroughly documented web interface to the parallel version of the CESM-OMWG diagnostics so that researchers outside of NCAR could more easily work with the CESM, and so that researchers could extend this tool to other datasets and diagnostics or build their own, thereby increasing public access to both data and the tools used to understand it.
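The idea of driving external scripts from Python with user-defined settings, and running independent scripts concurrently, can be illustrated with a short sketch. It is an assumption-laden stand-in, not the library described here: the diagnostics rely on Swift for parallelism, whereas this example uses a standard-library thread pool, and `run_script`, `run_all`, `fake_plot`, and the `CASE` setting are hypothetical names. A small Python one-liner takes the place of an NCL script so the example runs anywhere.

```python
import os
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_script(command, settings):
    """Run one external command with user settings passed as environment variables."""
    env = dict(os.environ, **settings)
    completed = subprocess.run(
        command, env=env, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()


def run_all(commands, settings, workers=4):
    """Run independent commands concurrently; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda cmd: run_script(cmd, settings), commands))


def fake_plot(name):
    """Stand-in for an NCL plotting script: echoes the hypothetical CASE setting."""
    code = "import os; print('%s ' + os.environ['CASE'])" % name
    return [sys.executable, "-c", code]
```

For instance, `run_all([fake_plot("sst"), fake_plot("ssh")], {"CASE": "demo"})` launches both commands at once with the same settings. Passing `check=True` makes a failing script raise `subprocess.CalledProcessError` instead of being silently ignored, which is one way a Python driver can be more robust than a shell script that continues past errors.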
