Distributed Computation Resources for Earth System Grid Federation (ESGF)
The Sixth Assessment Report (AR6) is already in the planning stages and is estimated to create as much as two orders of magnitude more data than the AR5 distributed archive. It is clear that data analysis capabilities currently in use will be inadequate to allow for the necessary science to be done with AR6 data—the data will just be too big. A major paradigm shift from downloading data to local systems to perform data analytics must evolve to moving the analysis routines to the data and performing these computations on distributed platforms. In preparation for this need, the ESGF has started a Compute Working Team (CWT) to create solutions that allow users to perform distributed, high-performance data analytics on the AR6 data. The team will be designing and developing a general Application Programming Interface (API) to enable highly parallel, server-side processing throughout the ESGF data grid. This API will be integrated with multiple analysis and visualization tools, such as the Ultrascale Visualization Climate Data Analysis Tools (UV-CDAT), netCDF Operator (NCO), and others.
This presentation will provide an update on the ESGF CWT's overall approach toward enabling the necessary storage proximal computational capabilities to study climate change using the AR6 extreme scale distributed data archive. An update on the API will be provided along with a survey of the overall computational approaches being reviewed and studied by the members of the ESGF CWT.