As part of the Calibrated and Systematic Characterization, Attribution, and Detection of Extremes (CASCADE) project, we highlight a stack of tools our team has developed and uses to make large-scale simulation and analysis work routine. These tools assist with everything from the generation and procurement of data (HTAR/Globus) to the automated publication of results to portals such as the Earth System Grid Federation (ESGF), while executing every intermediate step in a scalable, task-parallel environment (threads/MPI).
In this presentation, we will highlight the capabilities of our workflow suite to: efficiently move data from a variety of input sources while managing quota constraints; execute a parallel pipeline of modeling, analysis, and statistical routines operating under different programming environments (Python, R, C++) while accounting for data movement; and, finally, publish results back to the greater climate science community. We illustrate the benefit of these tools through several climate science analysis and modeling use cases to which they have been applied.