Tuning HDF5 (and NetCDF-4) Applications to Overcome and Avoid Compression Pitfalls

Tuesday, 6 January 2015: 9:15 AM
131C (Phoenix Convention Center - West and North Buildings)
Larry Knox, The HDF Group, Champaign, IL; and E. Pourmal and A. Cheng

Compression of large datasets in S-NPP, JPSS, and other HDF5 data files can substantially reduce their size, in turn reducing disk space requirements and file download times. However, mismatches among the chunked layout of a file's datasets, the HDF5 library's chunk cache settings, and the access pattern of an application using the data can result in poor performance or exhaust machine resources when the application runs. Whether designed in advance or modified in response to problems encountered, applications can be tuned to access data efficiently, avoid unnecessary repeated decompression, and reduce memory use. Examples will show problems that may be encountered, tools that can diagnose or work around them, cache settings that conserve memory, and access strategies that avoid both performance and memory issues when creating or modifying applications.
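The tuning knobs mentioned above can be sketched with the HDF5 C API: size the per-dataset chunk cache (H5Pset_chunk_cache) to hold at least one uncompressed chunk, and read in chunk-aligned hyperslabs so each chunk is decompressed only once. This is an illustrative sketch, not code from the talk; the file name, dataset dimensions, chunk shape, and cache values are invented for illustration, and error checking is omitted for brevity.

```c
#include <hdf5.h>
#include <stdlib.h>

#define NROWS      1024
#define NCOLS      1024
#define CHUNK_ROWS 64          /* one chunk spans 64 full rows */

int main(void)
{
    hsize_t dims[2]  = { NROWS, NCOLS };
    hsize_t chunk[2] = { CHUNK_ROWS, NCOLS };
    int *buf = malloc((size_t)NROWS * NCOLS * sizeof *buf);
    for (size_t i = 0; i < (size_t)NROWS * NCOLS; i++)
        buf[i] = (int)i;

    /* Create a chunked, deflate-compressed dataset. */
    hid_t file  = H5Fcreate("tuned.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    hid_t space = H5Screate_simple(2, dims, NULL);
    hid_t dcpl  = H5Pcreate(H5P_DATASET_CREATE);
    H5Pset_chunk(dcpl, 2, chunk);
    H5Pset_deflate(dcpl, 6);
    hid_t dset = H5Dcreate2(file, "data", H5T_NATIVE_INT, space,
                            H5P_DEFAULT, dcpl, H5P_DEFAULT);
    H5Dwrite(dset, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, buf);
    H5Dclose(dset);
    H5Pclose(dcpl);

    /* Reopen with a chunk cache sized to hold one uncompressed chunk.
     * If the cache is smaller than a chunk, partial reads force HDF5 to
     * decompress the same chunk over and over. */
    size_t chunk_bytes = (size_t)CHUNK_ROWS * NCOLS * sizeof(int);
    hid_t dapl = H5Pcreate(H5P_DATASET_ACCESS);
    H5Pset_chunk_cache(dapl, 521 /* hash slots; a prime */, chunk_bytes,
                       1.0 /* evict fully read chunks first */);
    dset = H5Dopen2(file, "data", dapl);

    /* Read chunk-aligned hyperslabs: each iteration touches exactly one
     * chunk, so every chunk is decompressed exactly once. */
    hsize_t start[2] = { 0, 0 }, count[2] = { CHUNK_ROWS, NCOLS };
    hid_t mspace = H5Screate_simple(2, count, NULL);
    hid_t fspace = H5Dget_space(dset);
    for (start[0] = 0; start[0] < NROWS; start[0] += CHUNK_ROWS) {
        H5Sselect_hyperslab(fspace, H5S_SELECT_SET, start, NULL, count, NULL);
        H5Dread(dset, H5T_NATIVE_INT, mspace, fspace, H5P_DEFAULT, buf);
    }

    H5Sclose(mspace);
    H5Sclose(fspace);
    H5Dclose(dset);
    H5Pclose(dapl);
    H5Sclose(space);
    H5Fclose(file);
    free(buf);
    return 0;
}
```

Reading row slices narrower than a chunk, or iterating column-wise across row-oriented chunks, would instead pull each compressed chunk through the decompressor many times, which is one of the pitfalls the talk addresses.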