2.3 Enabling End-to-End Climate Science Workflows in High Performance Computing Environments

Thursday, 14 January 2016: 11:30 AM
Room 344 (New Orleans Ernest N. Morial Convention Center)
Harinarayan Krishnan, LBNL, Berkeley, CA; and B. Loring, S. Byna, M. F. Wehner, T. A. O'Brien, M. Prabhat, C. Paciorek, and D. Stone

A typical climate science workflow combines data acquisition, modeling, simulation, analysis, visualization, publishing, and storage of results. Each of these tasks presents its own challenges when run in a high performance computing environment such as Hopper or Edison at NERSC. Data transfer and management, job scheduling, parallel analysis routines, and publication all require careful planning to ensure that proper quality control mechanisms are in place. These steps require effectively combining well-tested and newly developed functionality to move data, perform analysis, apply statistical routines, and, finally, serve results and tools to the greater scientific community.

As part of the CAlibrated and Systematic Characterization, Attribution and Detection of Extremes (CASCADE) project, we highlight a stack of tools that our team uses and has developed to make large-scale simulation and analysis work commonplace. These tools assist with everything from generation and procurement of data (HTAR/Globus) to automated publication of results to portals such as the Earth System Grid Federation (ESGF), while executing everything in between in a scalable, task-parallel environment (Threads/MPI).

In this presentation, we will highlight the capabilities of our workflow suite to: efficiently move data from a variety of input sources while managing quota constraints; execute a parallel pipeline of modeling, analysis, and statistical routines operating under different programming environments (Python, R, C++) while accounting for data movement; and, finally, publish results back to the greater climate science community. We illustrate the benefit of these tools with several climate science analysis and modeling use cases to which they have been applied.
