Wednesday, 31 January 2024: 5:15 PM
324 (The Baltimore Convention Center)
Robert C. Jackson, Argonne National Laboratory, Argonne, IL; and A. Sedlacek, A. Theisen, S. M. Collis, M. A. Grover, J. R. O'Brien, Z. Sherman, E. Schuman, R. Records, F. Parry, M. Giansiracusa, and A. Stokes
The Single Particle Soot Photometer (SP2) collects particle-resolved statistics on the mass loading of refractory black carbon (rBC) aerosol and the size distribution of the rBC-containing particle ensemble. Because such particle-resolved measurements generate a very high volume of data, a significant amount of processing is required to produce the end-product statistics: individual rBC particle masses and sizes, from which the particle mass and size distributions, respectively, are built. This processing is typically accomplished using an IGOR language-based software package provided by Droplet Measurement Technologies (DMT, the SP2 manufacturer). However, the IGOR code is not portable to high-performance computing (HPC) clusters such as the U.S. Department of Energy Atmospheric Radiation Measurement (ARM) facility’s Cumulus cluster. Further, given that a single day of SP2 data can exceed 10 gigabytes, a single field experiment can take several weeks to process on a single machine, which limits the usability of SP2 data to short time periods.
The PySP2 package was developed to overcome these restrictions. It is open source, written entirely in Python, and available on conda-forge for easy installation. Being pure Python also makes it portable and able to interact with Dask for parallel processing of SP2 data. In this talk, we will first give a quick tutorial on how to use PySP2. We then demonstrate that, using 468 cores on the ARM Cumulus cluster, PySP2 can process the entire SP2 dataset collected in Aldine, TX during the ARM TRacking Aerosol Convection intERactions (TRACER) experiment from 1 June to 30 September 2022 in about 8 hours, whereas serial processing takes over 1 month. This result shows how PySP2 expands the SP2’s usability by harnessing parallel processing.
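
As a rough check on the figures quoted above, the following sketch estimates the implied speedup and parallel efficiency. It uses no PySP2 functions. It assumes that "over 1 month" can be taken as 30 days, so the serial time, and with it the speedup and efficiency, are lower bounds; the variable names are illustrative only.

```python
from datetime import date

# Figures quoted in the abstract.
CORES = 468
PARALLEL_HOURS = 8.0        # "about 8 hours" on the Cumulus cluster
# Assumption: "over 1 month" is taken as 30 days, which is a lower bound.
SERIAL_HOURS = 30 * 24.0

# Campaign length from 1 June to 30 September 2022, counting both end dates.
campaign_days = (date(2022, 9, 30) - date(2022, 6, 1)).days + 1

# Speedup is serial time over parallel time; efficiency is speedup per core.
speedup = SERIAL_HOURS / PARALLEL_HOURS
efficiency = speedup / CORES

print(f"campaign length: {campaign_days} days")
print(f"speedup: at least about {speedup:.0f}x")
print(f"parallel efficiency: at least about {efficiency:.0%}")
```

Under these assumptions the campaign spans 122 days and the speedup is at least about 90x on 468 cores. The true values depend on the actual serial runtime, which the abstract gives only as "over 1 month".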
