Tuesday, 8 January 2019
Hall 4 (Phoenix Convention Center - West and North Buildings)
Jerry Bieszczad, Creare LLC, Hanover, NH; and M. Shapiro, D. Entekhabi, D. R. Callender, J. Milloy, D. Sullivan, and M. P. Ueckermann
PODPAC, the Pipeline for Observation Data Processing, Analysis, and Collaboration, is a Python-based software library for automated data harmonization and seamless transition to cloud processing. Data sources encapsulated by PODPAC are automatically projected and interpolated to a user-specified geospatial reference system, which allows plug-and-play development of processing pipelines using multi-scale and multi-source data. These pipelines may be developed in Jupyter notebooks on local machines, and they can also be exported as an automatically generated text-based description and run on massively distributed remote cloud servers. PODPAC is being developed under a permissive open source license and is available at https://github.com/creare-com/podpac.
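To make the idea of data harmonization concrete, the following is a minimal sketch in plain Python and does not use the PODPAC API. It assumes two hypothetical gridded sources (a coarse soil moisture grid and a finer elevation grid, both with invented values) and resamples each onto the same user-specified target coordinates with bilinear interpolation. The function names `interp_1d` and `regrid` are illustrative only.

```python
from bisect import bisect_right


def interp_1d(xs, ys, x):
    """Linearly interpolate ys (sampled at ascending xs) at x, clamping at the ends."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x) - 1  # index of the sample at or just below x
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] * (1 - t) + ys[i + 1] * t


def regrid(lats, lons, grid, target_lats, target_lons):
    """Bilinear regrid: interpolate along longitude in each row, then along latitude."""
    rows = [[interp_1d(lons, row, lon) for lon in target_lons] for row in grid]
    return [
        [interp_1d(lats, [r[j] for r in rows], lat) for j in range(len(target_lons))]
        for lat in target_lats
    ]


# Hypothetical coarse source: soil moisture on a 2 x 2 grid (invented values).
sm_lats, sm_lons = [0.0, 2.0], [0.0, 2.0]
soil_moisture = [[0.1, 0.3], [0.5, 0.7]]

# Hypothetical fine source: elevation on a 3 x 3 grid (invented values).
el_lats, el_lons = [0.0, 1.0, 2.0], [0.0, 1.0, 2.0]
elevation = [[100.0 * la + 10.0 * lo for lo in el_lons] for la in el_lats]

# User-specified target coordinates shared by both outputs.
target_lats, target_lons = [0.5, 1.5], [0.5, 1.5]

sm_on_target = regrid(sm_lats, sm_lons, soil_moisture, target_lats, target_lons)
el_on_target = regrid(el_lats, el_lons, elevation, target_lats, target_lons)
```

After this step both arrays share one grid, so they can be combined cell by cell regardless of the native resolution of each source. A real system would also need to handle map projections, which this sketch omits.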
This paper demonstrates PODPAC usage through example applications that combine multiple data sources and run on the AWS commercial cloud. In particular, we will show applications involving NASA observational data products such as SMAP (Soil Moisture Active-Passive), distributed sensor networks such as the COSMOS soil moisture networks, and digital terrain model data. These data sources will be encapsulated using PODPAC to demonstrate the automated data harmonization features. We will also show how to generate a PODPAC processing pipeline and then execute it both locally and remotely using serverless AWS Lambda functions. A description of this cloud-based architecture will also be presented. Finally, we will describe our progress to date and the planned development goals for the PODPAC software.
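The idea of exporting a pipeline as a text-based description and running the same description locally or remotely can be sketched as follows. This is an illustration under stated assumptions, not PODPAC's actual pipeline format: the JSON layout, the `OPERATIONS` registry, and the node names are all hypothetical. The `handler(event, context)` function only mirrors the general shape of a Python AWS Lambda handler and is called directly here rather than deployed.

```python
import json

# Hypothetical registry of node operations; each takes evaluated inputs and attributes.
OPERATIONS = {
    "constant": lambda inputs, attrs: attrs["value"],
    "add": lambda inputs, attrs: sum(inputs),
    "scale": lambda inputs, attrs: inputs[0] * attrs["factor"],
}


def evaluate(description, output):
    """Rebuild a pipeline from its JSON text and evaluate the named output node."""
    spec = json.loads(description)
    cache = {}

    def run(name):
        # Evaluate each node once, resolving its inputs recursively.
        if name not in cache:
            node = spec[name]
            inputs = [run(dep) for dep in node.get("inputs", [])]
            cache[name] = OPERATIONS[node["op"]](inputs, node.get("attrs", {}))
        return cache[name]

    return run(output)


def handler(event, context):
    """Hypothetical remote entry point: receives the same text description in an event."""
    return {"result": evaluate(event["pipeline"], event["output"])}


# A toy pipeline with invented values, defined locally as a dictionary of nodes.
pipeline = {
    "soil_moisture": {"op": "constant", "attrs": {"value": 0.25}},
    "offset": {"op": "constant", "attrs": {"value": 0.05}},
    "total": {"op": "add", "inputs": ["soil_moisture", "offset"]},
    "percent": {"op": "scale", "inputs": ["total"], "attrs": {"factor": 100}},
}

# Export to a text description, then run it locally and through the handler.
description = json.dumps(pipeline, indent=2)
local_result = evaluate(description, "percent")
remote_result = handler({"pipeline": description, "output": "percent"}, None)["result"]
```

Because the pipeline travels as plain text, the local and remote paths execute the same definition, so a result obtained in a notebook can be reproduced by a remote function without shipping the notebook code itself.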
Supplementary URL: https://podpac.org