Tuesday, 8 January 2019
Hall 4 (Phoenix Convention Center - West and North Buildings)
Jerry Bieszczad, Creare LLC, Hanover, NH; and M. Shapiro, D. Entekhabi, D. R. Callender, J. Milloy, D. Sullivan, and M. P. Ueckermann
PODPAC, the Pipeline for Observation Data Processing Analysis and Collaboration, is a Python-based software library for automated data harmonization and seamless transition to cloud processing. Data sources encapsulated by PODPAC are automatically projected and interpolated to a user-specified geospatial reference system. This allows plug-and-play development of processing pipelines that use multi-scale and multi-source data. While these pipelines may be developed in Jupyter notebooks on local machines, they can also be exported as an automatically generated text-based description and run on massively distributed remote cloud servers. PODPAC is being developed under a permissive open source license, and is available at
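To illustrate the harmonization idea described above, the following is a minimal sketch that uses plain NumPy rather than the PODPAC API. It assumes two gridded sources at different resolutions on regular latitude/longitude axes and resamples both onto one user-specified target grid by nearest-neighbor lookup. The function names `nearest_index` and `regrid_nearest` and all sample values are hypothetical, and nearest-neighbor is only one of many possible interpolation schemes.

```python
import numpy as np


def nearest_index(src, dst):
    """Return, for each dst coordinate, the index of the closest src coordinate."""
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    return np.abs(dst[:, None] - src[None, :]).argmin(axis=1)


def regrid_nearest(data, src_lat, src_lon, dst_lat, dst_lon):
    """Resample a 2-D (lat, lon) array onto a target grid by nearest neighbor."""
    rows = nearest_index(src_lat, dst_lat)
    cols = nearest_index(src_lon, dst_lon)
    return np.asarray(data)[np.ix_(rows, cols)]


# Hypothetical coarse source: 2 x 2 grid (stand-in for a satellite product).
coarse = np.array([[1.0, 2.0], [3.0, 4.0]])
coarse_lat = coarse_lon = np.array([0.0, 1.0])

# Hypothetical fine source: 5 x 5 grid (stand-in for a terrain model).
fine = np.arange(25, dtype=float).reshape(5, 5)
fine_lat = fine_lon = np.linspace(0.0, 1.0, 5)

# User-specified target grid shared by both sources.
target_lat = target_lon = np.linspace(0.0, 1.0, 3)

coarse_on_target = regrid_nearest(coarse, coarse_lat, coarse_lon, target_lat, target_lon)
fine_on_target = regrid_nearest(fine, fine_lat, fine_lon, target_lat, target_lon)

# Once both sources share a grid, they can be combined element-wise.
combined = coarse_on_target + fine_on_target
```

After resampling, both arrays have the shape of the target grid, so downstream processing steps do not need to know the native resolution of either source.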
This paper demonstrates PODPAC usage through example applications that combine multiple data sources and run on the AWS commercial cloud. In particular, we will show applications involving NASA observational data products such as SMAP (Soil Moisture Active Passive), distributed sensor networks such as the COSMOS soil moisture networks, and digital terrain model data. These data sources will be encapsulated using PODPAC to demonstrate the automated data harmonization features. We will also show how to generate a PODPAC processing pipeline and then execute it both locally and remotely using serverless AWS Lambda functions. A description of this cloud-based architecture will also be presented. Finally, we will describe our progress to date and the planned development goals for the PODPAC software.
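The idea of running one text-based pipeline description both locally and in a serverless function can be sketched as follows. This is an illustration under stated assumptions, not PODPAC's actual pipeline format or API: the JSON layout, the `OPERATIONS` table, and `run_pipeline` are hypothetical, and nodes are assumed to be listed in dependency order. Only the `lambda_handler(event, context)` signature follows the standard convention for Python handlers on AWS Lambda.

```python
import json

# Hypothetical operation table; each operation receives its resolved inputs
# plus any keyword parameters given in the description.
OPERATIONS = {
    "constant": lambda inputs, value: value,
    "add": lambda inputs: sum(inputs),
    "scale": lambda inputs, factor: inputs[0] * factor,
}


def run_pipeline(description):
    """Evaluate a JSON pipeline description and return the output node's value."""
    spec = json.loads(description)
    results = {}
    # Assumption: nodes appear in dependency order in the description.
    for name, node in spec["nodes"].items():
        inputs = [results[key] for key in node.get("inputs", [])]
        results[name] = OPERATIONS[node["op"]](inputs, **node.get("params", {}))
    return results[spec["output"]]


def lambda_handler(event, context):
    """Entry point shaped like an AWS Lambda handler; runs the same pipeline text."""
    return {"result": run_pipeline(event["pipeline"])}


# The text-based description that would be exported and shipped to the cloud.
description = json.dumps({
    "nodes": {
        "a": {"op": "constant", "params": {"value": 2.0}},
        "b": {"op": "constant", "params": {"value": 3.0}},
        "total": {"op": "add", "inputs": ["a", "b"]},
        "scaled": {"op": "scale", "inputs": ["total"], "params": {"factor": 10}},
    },
    "output": "scaled",
})

local_result = run_pipeline(description)
remote_result = lambda_handler({"pipeline": description}, None)["result"]
```

Because the description is plain text, the same string can be evaluated in a notebook or passed as the event payload of a remote function, and both paths yield the same result.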
Supplementary URL: https://podpac.org

