4.4 The Big Climate Data Toolkit: A Modern Software Stack for Evaluations of High-Resolution Climate Datasets

Tuesday, 8 January 2019: 9:15 AM
North 129B (Phoenix Convention Center - West and North Buildings)
Alexander Goodman, Jet Propulsion Laboratory, Pasadena, CA; and H. Lee

As part of an effort related to the ongoing National Climate Assessment (NCA), JPL has led the development of the Regional Climate Model Evaluation System (RCMES), an open-source Python software tool, powered by the Apache Software Foundation’s Open Climate Workbench (OCW), for evaluating regional climate models (RCMs) against observations (e.g., obs4MIPs satellite data). While OCW has proven robust for the scientific use cases of JPL developers and their collaborators, the software development landscape has changed greatly as a result of the push for Big Data analytics. The capability to analyze Big Data has become especially relevant in climate science because of increases in both spatial and temporal (monthly to daily or even hourly) resolution. This leaves the current OCW somewhat inefficient, since it was developed primarily for sequential workloads that evaluate simulated monthly data against monthly observations. Additionally, much of the underlying software is outdated, with most dataset processing relying solely on the numpy and scipy (numpy.scipy.org) Python libraries. To address these issues, we are developing a successor to the OCW library, tentatively called the Big Climate Data Toolkit (BCDT). BCDT will primarily feature a transition from plain numpy arrays to labeled xarray data arrays (xarray.pydata.org). This will significantly simplify our dataset processing application programming interfaces (APIs), since xarray data arrays have built-in temporal resampling and group-by operations. xarray also integrates directly with the dask distributed computing library, which opens up the possibility of parallel and out-of-core workloads. To demonstrate the benefits of these changes, we benchmarked a simple evaluation featuring a suite of RCMs from the Coordinated Regional Downscaling Experiment (CORDEX) and obs4MIPs satellite data.
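The built-in resampling and group-by operations mentioned above can be illustrated with a minimal sketch. This is not BCDT or OCW code: the synthetic fields, the variable name `tas`, and the helper `make_field` are hypothetical stand-ins for a daily model dataset and a daily observational product.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily fields on a small lat/lon grid (assumed data, for illustration only).
time = pd.date_range("2000-01-01", "2000-12-31", freq="D")
lat = np.linspace(30.0, 40.0, 3)
lon = np.linspace(-120.0, -110.0, 4)
rng = np.random.default_rng(0)


def make_field(offset):
    """Hypothetical helper: a labeled (time, lat, lon) array of noise around `offset`."""
    data = offset + rng.normal(size=(time.size, lat.size, lon.size))
    return xr.DataArray(
        data,
        coords={"time": time, "lat": lat, "lon": lon},
        dims=("time", "lat", "lon"),
        name="tas",
    )


model = make_field(288.0)  # stand-in for RCM output
obs = make_field(287.0)    # stand-in for an observational dataset

# Temporal resampling: daily values to monthly means, one call per dataset.
model_monthly = model.resample(time="MS").mean()
obs_monthly = obs.resample(time="MS").mean()

# Group-by: average the monthly bias within each meteorological season.
seasonal_bias = (model_monthly - obs_monthly).groupby("time.season").mean("time")

# Domain-wide mean bias as a single number.
mean_bias = float(seasonal_bias.mean())
print(seasonal_bias.sizes, mean_bias)
```

With dask installed, the same operations could in principle be run lazily on chunked data by calling `.chunk()` on the arrays first, which is the route to the parallel and out-of-core workloads described above.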