Brian McKenna, RPS ASA Brian.McKenna@rpsgroup.com
The wide-ranging computing requirements of scientific services and research can make developing and maintaining infrastructure challenging and costly. Numerical simulation and forecasting generally require a high-performance computing environment with high-speed, low-latency networks that support parallel execution, whereas managing and analyzing the output data require vast storage and high-memory environments to properly visualize and analyze the largest datasets.
In 2013, to mitigate the challenges of deploying complex infrastructure globally, Amazon Web Services (AWS) cloud infrastructure was used to build a cloud-based forecast system that schedules routine and on-demand forecasts of atmospheric and oceanographic parameters using the WRF atmospheric model, the ROMS hydrodynamic model, and the unstructured SWAN wave model. The ability to use cloud offerings only when and where they are needed facilitated a global presence of tools and services without an upfront commitment to maintaining complex infrastructure in multiple data centers.
To complement the cloud-based forecast system, a data management solution providing access to and visualization of model and observation data was later modified to be deployable alongside the forecast system. The data management system enables all model data to be presented in community-standard formats, such as those following the Climate and Forecast (CF) metadata conventions. A web-based geospatial data analysis application allows users to interactively query model results and observations and to perform robust analytics on the available data.
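As a rough illustration of what presenting model data with CF-style metadata can involve, the sketch below describes one output variable with a few attributes and checks that they are present. The variable, the `missing_cf_attributes` helper, and the choice of which attributes to treat as required are assumptions made for this example; they are not taken from the system described here and do not cover the full CF conventions.

```python
# Hypothetical attributes for one model output variable (illustrative only).
sea_surface_temp = {
    "standard_name": "sea_surface_temperature",  # name from the CF standard name table
    "units": "K",                                # canonical units for that standard name
    "long_name": "Sea surface temperature",
}

# File-level attribute declaring which version of the conventions is followed.
GLOBAL_ATTRS = {"Conventions": "CF-1.6"}


def missing_cf_attributes(var_attrs, required=("standard_name", "units")):
    """Return the names in `required` that are absent or empty in var_attrs.

    The default `required` tuple is an assumption of this sketch, not a
    statement of what the CF conventions mandate.
    """
    return [name for name in required if not var_attrs.get(name)]


if __name__ == "__main__":
    print(missing_cf_attributes(sea_surface_temp))
    print(missing_cf_attributes({"units": "m"}))
```

A check of this kind could run before data are published, so that downstream clients relying on the metadata do not receive incomplete variables.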
The on-demand forecast and analysis system not only provided a ‘develop local, deploy global’ solution for our products, but was also seen as a potential benefit to our scientific staff. The environment allows on-demand, rapid execution and analysis of various configurations, such as grids, initialization sources, and model parameters, similar to larger high-performance computing (HPC) cluster environments, with no upfront financial commitment. Scientific staff could increase efficiency and spend more time on science rather than managing or waiting for available infrastructure.
An approach similar to the popular agile software development methodology was suggested as a way to use the on-demand, scalable cloud environment to build iterative and incremental scientific experiments and products. With this approach, scientists can quickly analyze and adapt to incremental results and use the iterative process to review spending more often, helping to ensure that effort and funding are sufficient to complete projects on time and within budget.