323 PyABL and PyABL-HPC: Toolkits for the Parallel Analysis of WRF-LES Big Data Output using Python

Monday, 23 January 2017
4E (Washington State Convention Center)
Timothy S. Sliwinski, Texas Tech Univ., Lubbock, TX; and S. L. Kang

Atmospheric models configured for large eddy simulation (LES) are often employed when it is necessary to resolve the largest, most energy-containing turbulent features of the atmospheric boundary layer (ABL). When grid spacings on the order of tens of meters are used, LES of mesoscale domains on the order of tens of kilometers can easily produce more than 1 TB of output for a simulation period of about half a day. Simulations of this kind are necessary to capture how the turbulent structure of the ABL changes from early morning through late afternoon due to changes in ABL height and the associated scaling. However, with output of this size, analyzing and postprocessing the resulting dataset poses a distinct challenge to researchers who may require quick results. In addition, as computing resources continue to increase in power and scale, simulations of this size are likely to become more common.

To overcome this issue in our own research and to offer a solution to fellow researchers facing it, this work presents two new Python packages aimed at the boundary layer community: PyABL (pronounced “pie-able”) and PyABL-HPC. PyABL is designed as a standalone set of core utilities that gathers common routines for ABL analysis into one easy-to-use, redistributable package. It provides commonly used analysis techniques such as vertical flux profiles, common methods of determining boundary layer height, and more. PyABL-HPC builds on the core utilities of PyABL and extends them to parallel environments, where high performance computing resources such as Linux-based compute clusters and GPGPU hardware offer opportunities for better performance when analyzing these large datasets.
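As an illustration of the kind of routine described above, the following is a minimal NumPy sketch of a resolved vertical heat flux profile and a minimum-flux estimate of boundary layer height. It is not PyABL's API: the function names (`horizontal_mean`, `resolved_flux_profile`, `abl_height_min_flux`) are hypothetical, the input fields are synthetic, and the sketch assumes that vertical velocity and potential temperature have already been interpolated to the same levels and that the subgrid-scale flux can be ignored.

```python
import numpy as np


def horizontal_mean(field):
    """Average a (z, y, x) field over the horizontal plane, one value per level."""
    return field.mean(axis=(1, 2))


def resolved_flux_profile(w, theta):
    """Resolved kinematic heat flux <w'theta'> at each level.

    Primes are deviations from the horizontal mean at that level.
    """
    w_prime = w - horizontal_mean(w)[:, None, None]
    theta_prime = theta - horizontal_mean(theta)[:, None, None]
    return horizontal_mean(w_prime * theta_prime)


def abl_height_min_flux(z, flux):
    """Height of the most negative heat flux (one common ABL height estimate)."""
    return z[np.argmin(flux)]


# Synthetic test case (hypothetical values, not WRF-LES output).
z = np.arange(0.0, 1500.0, 50.0)   # level heights in meters
ny = nx = 8

# Idealized flux: linear decrease to a minimum at 1000 m, then back to zero.
below = 0.1 * (1.0 - 1.2 * z / 1000.0)
above = -0.02 * np.clip(1.0 - (z - 1000.0) / 200.0, 0.0, None)
flux_true = np.where(z <= 1000.0, below, above)

# Checkerboard pattern with zero horizontal mean and unit variance.
iy, ix = np.indices((ny, nx))
s = np.where((iy + ix) % 2 == 0, 1.0, -1.0)

# Fields built so that <w'theta'> equals flux_true at every level.
w = np.ones_like(z)[:, None, None] * s
theta = (300.0 + 0.003 * z)[:, None, None] + flux_true[:, None, None] * s

flux = resolved_flux_profile(w, theta)
zi = abl_height_min_flux(z, flux)
print("Estimated ABL height (m):", zi)
```

Because each level is reduced independently, a calculation of this shape can be split across levels or output times, which is the kind of structure a parallel toolkit can exploit.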

This presentation will showcase the functionality of both PyABL and PyABL-HPC using WRF-LES datasets simulating the ABL over homogeneous surface forcing. These datasets span horizontal and vertical grid spacings from 150 m down to 12.5 m over a fixed 9 km² domain. Because the domain size is fixed, finer grid spacing translates directly into larger datasets, reaching over 2 TB for the 12.5 m case. The parallel performance of the PyABL-HPC package will be assessed by timing a selection of its routines on these output datasets and comparing the results with the serial performance of the PyABL core routines.
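The link between grid spacing and dataset size on a fixed domain can be made concrete with a short calculation. The sketch below is only arithmetic on the two endpoint spacings quoted above; the helper `growth_factor` is hypothetical, and it assumes the number of grid points scales with the inverse of the spacing in each refined dimension. Actual output size also depends on output frequency and on the variables written, which the abstract does not specify.

```python
def growth_factor(dx_coarse, dx_fine, refined_dims=3):
    """Ratio of grid-point counts on a fixed domain when spacing is refined.

    refined_dims is the number of spatial dimensions refined by the same ratio
    (2 for horizontal only, 3 if the vertical is refined equally).
    """
    return (dx_coarse / dx_fine) ** refined_dims


# Endpoint spacings from the text: 150 m and 12.5 m.
horizontal_only = growth_factor(150.0, 12.5, refined_dims=2)
all_three = growth_factor(150.0, 12.5, refined_dims=3)

print("Horizontal refinement only:", horizontal_only)
print("All three dimensions refined:", all_three)
```

Under these assumptions the finest case holds 144 times as many points per output time as the coarsest if only the horizontal is refined, and 1728 times as many if the vertical is refined by the same ratio.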
