1.2 The Big Climate Data Pipeline (BCDP): An Open Source Python Library to Analyze High-Resolution Climate Models and Satellite Observations in Amazon Cloud and NASA’s High-Performance Computing Environments

Monday, 13 January 2020: 9:15 AM
157AB (Boston Convention and Exhibition Center)
Alexander Goodman, Jet Propulsion Laboratory, Pasadena, CA; and H. Lee and K. Gorski

Evaluating climate models against observations remains an important task for many climate assessment reports, such as the National Climate Assessment (NCA). The Jet Propulsion Laboratory (JPL) has previously contributed to these assessments through the Regional Climate Model Evaluation System (RCMES) project, which facilitated the development of an open source Python library, the Apache Open Climate Workbench (OCW), in 2013. Using Python libraries from the PyData stack such as xarray and dask, we have since created a more modern replacement: the Big Climate Data Pipeline (BCDP), which provides a more flexible API for configuring climate model evaluations and inherently supports Big Data use cases. We have also demonstrated that, for single-threaded use cases with monthly regional climate simulations, BCDP provided significant performance benefits over OCW because it uses xarray’s lazy-evaluation data model. While many evaluations involve very simple intercomparison metrics such as model bias and root mean square error (RMSE), BCDP also supports seamless integration of custom use cases by providing an extensible object-oriented API for each component of the data processing pipeline. We will demonstrate this capability by examining a use case that requires remapping high-resolution climate simulations of precipitation (NASA Earth Exchange Downscaled Climate Projections 30, NEX-DCP30), defined on standard latitude/longitude grids, to Hierarchical Equal Area isoLatitude Pixelization (HEALPix) grids. Results will be shown from evaluation runs of BCDP deployed in both cloud (a dask-kubernetes cluster on AWS EC2 instances) and HPC (the NASA Pleiades Supercomputer) environments.
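
For readers unfamiliar with the simple intercomparison metrics mentioned above, the following is a minimal sketch of an area-weighted bias and RMSE between a model field and an observed field on a shared latitude/longitude grid. It does not use the BCDP or OCW APIs: the function names, the cosine-of-latitude weighting, and the toy precipitation values are all assumptions made for illustration, and plain Python lists stand in for the lazily loaded xarray arrays that a real evaluation would use.

```python
import math


def area_weights(lats_deg):
    """Cosine-of-latitude weights (assumed approximation of grid-cell
    area on a regular latitude/longitude grid)."""
    return [math.cos(math.radians(lat)) for lat in lats_deg]


def weighted_bias_and_rmse(model, obs, lats_deg):
    """Return (bias, rmse) of model minus obs.

    model and obs are nested lists indexed [lat][lon] on the same grid.
    Hypothetical helper, not part of any library named in the abstract.
    """
    total_weight = 0.0
    sum_diff = 0.0
    sum_sq_diff = 0.0
    for w, model_row, obs_row in zip(area_weights(lats_deg), model, obs):
        for m, o in zip(model_row, obs_row):
            diff = m - o
            total_weight += w
            sum_diff += w * diff
            sum_sq_diff += w * diff * diff
    bias = sum_diff / total_weight
    rmse = math.sqrt(sum_sq_diff / total_weight)
    return bias, rmse


# Toy precipitation fields in mm/day (made-up values, 3 latitudes x 2 longitudes).
lats = [-30.0, 0.0, 30.0]
obs = [[2.0, 3.0], [5.0, 6.0], [1.0, 2.0]]
# The model is constructed to be uniformly 1 mm/day wetter than the observations.
model = [[3.0, 4.0], [6.0, 7.0], [2.0, 3.0]]

bias, rmse = weighted_bias_and_rmse(model, obs, lats)
print(f"bias={bias:.3f} mm/day, rmse={rmse:.3f} mm/day")
```

Because the toy model field differs from the observations by the same amount everywhere, the bias and RMSE coincide here; on real data the RMSE is at least as large as the absolute bias.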