4.4 Empowering Data Management, Diagnostics, and Visualization of Cloud-Resolving Models (CRMs) via a Cloud Library based upon Spark and Hadoop

Wednesday, 25 January 2017: 2:15 PM
3AB (Washington State Convention Center)
Wei-Kuo Tao, NASA, Greenbelt, MD; and X. H. Sun, S. Zhou, T. Matsui, and X. Li

A Super Cloud Library (SCL), capable of cloud-resolving model (CRM) database management (I/O control and compression), distribution, visualization, subsetting, and evaluation, is being developed at Goddard. The SCL architecture is built upon a Hadoop framework. The Hadoop Distributed File System (HDFS) is a stable, distributed, scalable, and portable file system. The Hadoop framework supports streaming, which enables 2D and 3D visualization via IDL code. Furthermore, Hadoop R enables various standard and non-standard statistics and their visualization. Within the Hadoop framework, the CRM diagnostic capabilities are being further enhanced with Spark, which is built on top of HDFS and accelerates Hadoop MapReduce processing by a factor of ~100.
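The MapReduce pattern behind such diagnostics can be illustrated with a minimal, plain-Python sketch. It is not the SCL implementation: the tile layout, the function names (`map_tile`, `combine`, `level_means`), and the sample values are all hypothetical. Each horizontal tile of a field is reduced independently to per-level partial sums (the map step), and the partials are then merged into domain-mean values per vertical level (the reduce step). In a Hadoop or Spark setting, the map step would run in parallel on the nodes that hold each tile.

```python
from functools import reduce

def map_tile(tile):
    """Map step: turn one tile into per-level (sum, count) pairs.

    A tile is a list of vertical levels; each level is a list of values
    (hypothetical layout chosen for illustration only).
    """
    return [(sum(level), len(level)) for level in tile]

def combine(a, b):
    """Reduce step: merge two lists of per-level (sum, count) pairs."""
    return [(sa + sb, ca + cb) for (sa, ca), (sb, cb) in zip(a, b)]

def level_means(tiles):
    """Domain mean per vertical level, computed tile by tile."""
    partials = map(map_tile, tiles)      # independent work per tile
    totals = reduce(combine, partials)   # merge partial results
    return [s / c for s, c in totals]

# Two toy tiles, each with two vertical levels (made-up values).
tiles = [
    [[1.0, 3.0], [10.0, 20.0]],
    [[5.0], [30.0]],
]
print(level_means(tiles))  # [3.0, 20.0]
```

Because `combine` is associative, the partial results can be merged in any order, which is what lets a framework distribute the work across nodes.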

The SCL hosts two types of CRM simulations: those from the NASA-Unified Weather Research and Forecasting (NU-WRF) model and those from the Goddard Cumulus Ensemble (GCE) model. Its users can conduct large-scale, on-demand tasks automatically, without needing to download voluminous CRM datasets and the various observations from NASA field campaigns and satellites to a local computer. In this talk, we will present highlights from extra-high-resolution (up to 250 m), large-domain (up to 4096 x 4096 x 106 grid points) simulations, focusing on subsetting, visualization, and diagnostics.
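To give a sense of why downloading such output is impractical, the following back-of-the-envelope sketch estimates the size of a single 3D field on the largest grid quoted above. It rests on two assumptions not stated in the abstract: that values are stored as 4-byte single-precision floats without compression, and that the 250 m spacing applies to the 4096 x 4096 x 106 grid.

```python
# Grid dimensions quoted in the abstract.
NX, NY, NZ = 4096, 4096, 106

# Assumptions (not from the abstract): 250 m spacing on this grid,
# and uncompressed 4-byte single-precision values.
DX_M = 250
BYTES_PER_VALUE = 4

points = NX * NY * NZ                   # grid points in one 3D field
bytes_per_field = points * BYTES_PER_VALUE
gib_per_field = bytes_per_field / 2**30
width_km = NX * DX_M / 1000             # horizontal extent of the domain

print(f"{points:,} grid points")                 # 1,778,384,896 grid points
print(f"{gib_per_field:.3f} GiB per 3D field")   # 6.625 GiB per 3D field
print(f"{width_km:.0f} km domain width")         # 1024 km domain width
```

Under these assumptions, one variable at one output time occupies several gibibytes, so a run with many variables and output times quickly becomes too large to move to a local machine.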
