The Big Weather Web (BWW; http://www.bigweatherweb.org) is an NSF-funded multi-university effort whose primary goal is to develop a common and sustainable Big Data infrastructure in support of weather prediction research and education in universities. Currently, seven university partners (Colorado State University; Pennsylvania State University; South Dakota School of Mines & Technology; Texas Tech University; University at Albany-SUNY; University of North Dakota; and University of Wisconsin-Milwaukee) generate NWP model output with the Advanced Research Weather Research and Forecasting (WRF) model. While the model domain and run time are fixed, each university node runs one or more unique instances of WRF. Specifically, each run may use a different set of physics, convective, boundary-layer, and/or radiative parameterization schemes, or a different set of initial conditions. As a result, a moderately large (currently 47-member) ensemble of CONUS-centered WRF forecasts produces model output every 3 hours out to the model run time of 84 hours.
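To make the size of the ensemble output concrete, the following is a minimal sketch of the output schedule implied by the numbers above (3-hourly output, 84-hour run time, 47 members). It assumes, hypothetically, that an initial (hour 0) output time is written and that each member writes one file per output time; the function names are illustrative only and are not part of any BWW software.

```python
from datetime import datetime, timedelta

OUTPUT_INTERVAL_H = 3   # from the text: output every 3 hours
RUN_LENGTH_H = 84       # from the text: 84-hour model run time
N_MEMBERS = 47          # from the text: current ensemble size


def lead_hours(interval=OUTPUT_INTERVAL_H, run_length=RUN_LENGTH_H,
               include_initial=True):
    """Return forecast lead times in hours.

    Assumption: hour 0 is included unless include_initial is False.
    """
    start = 0 if include_initial else interval
    return list(range(start, run_length + 1, interval))


def valid_times(init_time, hours):
    """Convert lead times (hours) to valid times for one initialization."""
    return [init_time + timedelta(hours=h) for h in hours]


hours = lead_hours()
# Assumption: one output file per member per output time.
files_per_cycle = len(hours) * N_MEMBERS

if __name__ == "__main__":
    init = datetime(2017, 1, 1, 0)  # hypothetical initialization time
    print(len(hours), "output times per member")
    print(files_per_cycle, "files per forecast cycle")
    print("last valid time:", valid_times(init, hours)[-1])
```

Under these assumptions each member contributes 29 output times per cycle; dropping the hour-0 output would reduce that to 28.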
Model output is uploaded from each university to Amazon Web Services (AWS) Simple Storage Service (S3). Then, using AWS’s Elastic Compute Cloud (EC2), the output is postprocessed using the Unified Post Processor (UPP) distributed by the Developmental Testbed Center (DTC). The postprocessed data is shared via the OPeNDAP data access protocol, using Unidata’s Thematic Real-time Environmental Distributed Data Services (THREDDS). THREDDS is easily deployed as a Tomcat-served web application on the EC2 instance.
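As an illustration of how a client might address the shared data, the sketch below builds an OPeNDAP endpoint URL using the `/thredds/dodsC/` path that a default THREDDS installation exposes. The host name and dataset path are hypothetical placeholders, not the actual BWW server or file layout.

```python
from urllib.parse import quote


def opendap_url(host, dataset_path, scheme="http"):
    """Build an OPeNDAP URL for a dataset served by a THREDDS server.

    Assumes the server uses the default THREDDS context path
    (/thredds) and the standard dodsC service prefix.
    """
    # Percent-encode unsafe characters but keep path separators intact.
    path = quote(dataset_path.strip("/"))
    return f"{scheme}://{host}/thredds/dodsC/{path}"


if __name__ == "__main__":
    # Hypothetical host and file name, for illustration only.
    url = opendap_url("thredds.example.org",
                      "/bww/member01/wrfprs_d01.00.grb2")
    print(url)
```

A URL of this form can typically be passed to an OPeNDAP-aware client (for example, a netCDF library built with DAP support) so that only the requested variables and subsets are transferred rather than whole files.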
Besides AWS, the BWW now has cloud resources provided by NSF’s Extreme Science and Engineering Discovery Environment (XSEDE). A virtual, cloud-based compute server has been deployed with OpenStack. XSEDE also provides access to the Texas Advanced Computing Center’s (TACC) Wrangler cloud-based storage system. BWW postprocessed model output is uploaded to Wrangler, which is mounted on the OpenStack server. A key part of this presentation will focus on the steps taken to deploy a containerized (via Docker) version of DTC’s Model Evaluation Tools (MET) on the OpenStack server, which is then used to perform verification studies on the postprocessed model output stored on Wrangler.
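The containerized workflow described above can be sketched as a `docker run` invocation that bind-mounts the Wrangler-backed directory into the container and calls a MET tool on it. This is a rough sketch under stated assumptions, not the deployment presented here: the image name `dtcenter/met`, the host and container paths, and the file names are all hypothetical placeholders, and the example only assembles the command without executing it. The `grid_stat` argument order (forecast file, observation file, configuration file, then `-outdir`) follows the tool's documented usage.

```python
import shlex


def met_docker_command(image, data_dir, out_dir, tool_args,
                       data_mount="/data", out_mount="/output"):
    """Assemble a `docker run` command list for a MET tool.

    data_dir is mounted read-only; out_dir is mounted read-write.
    Nothing is executed here; the caller decides how to run it.
    """
    return [
        "docker", "run", "--rm",
        "-v", f"{data_dir}:{data_mount}:ro",
        "-v", f"{out_dir}:{out_mount}",
        image,
        *tool_args,
    ]


if __name__ == "__main__":
    # All paths, file names, and the image name are hypothetical.
    cmd = met_docker_command(
        image="dtcenter/met",
        data_dir="/mnt/wrangler/bww",
        out_dir="/home/user/met_out",
        tool_args=[
            "grid_stat",
            "/data/member01/wrfprs_d01.24.grb2",   # forecast
            "/data/obs/analysis_24.grb2",          # gridded observation
            "/data/config/GridStatConfig",         # MET config file
            "-outdir", "/output",
        ],
    )
    print(shlex.join(cmd))
```

Mounting the model output read-only keeps the verification step from altering the shared archive, while the separate writable mount collects the statistics files produced by MET.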