813 Deploying and Computing Hydroclimate Data and Services in AWS GovCloud

Thursday, 1 February 2024
Hall E (The Baltimore Convention Center)
Harsh Patel, U.S. Army Corps of Engineers (USACE) CTR, McLean, VA; and M. sant-Miller and R. Harris

AWS GovCloud is an integral part of delivering the latest climate data to decision-makers and the public. The AWS suite provides access to high-compute cloud resources for data preprocessing and streamlines the development and deployment of web applications. Web applications are one of the most common ways to make climate data widely available and accessible, but complexities arise when considering scalability, performance, security, and reliability. AWS services aim to serve as interoperable building blocks that address these complexities while allowing maximum customization of technology stacks based on user needs.

The United States Army Corps of Engineers (USACE) has developed three climate web applications, CHAT, TST, and SLAT, that each aim to solve distinct climate data challenges. All three tools use AWS GovCloud, but each introduces subtle differences to handle its data sources. CHAT, the Climate Hydrology Assessment Tool, is an R Shiny application that visualizes and analyzes streamflow, precipitation, and temperature projections from the Coupled Model Intercomparison Project (CMIP) at the stream segment level. At this resolution, the CHAT database contains multiple data tables of tens to hundreds of millions of rows each. Because of this data volume, CHAT leverages AWS services such as S3, EC2, and RDS for big data management and cloud computation. Migration to these services improved the quality and reliability of preprocessed data and the performance of data retrieval for visualization. In contrast, TST, the Time Series Toolbox, performs real-time analysis on time series data through a suite of statistical methodologies. The tool has two options for data sources: preloaded streamflow gage data or user-uploaded time series data. For both, TST combines an R Shiny backbone with AWS EC2 compute resources for the intensive, real-time statistical analysis performed within the tool. Users can perform quick, repeatable analysis on different custom time series datasets, supporting synthesis and comparison of results. SLAT, the Sea Level Analysis Tool, draws most of its data from NOAA’s public Tides & Currents API and, unlike TST, does not require highly sophisticated data science tools. Accordingly, SLAT uses React instead of R Shiny, which helps enhance performance and scalability. The React application is deployed using AWS ECS, which streamlines the release of new versions through automated deployments to the cloud and has built-in capabilities to autoscale based on network traffic, so that concurrent users can access the data and application with no impact on performance.
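As a rough illustration of the kind of request a client such as SLAT might send to NOAA's Tides & Currents API, the sketch below builds a query URL for monthly mean water levels. It is not taken from SLAT itself: the function name, the choice of the `monthly_mean` product, the default datum, and the `application` identifier are all assumptions made for the example, and the station ID is supplied by the caller.

```python
from urllib.parse import urlencode

# Public NOAA CO-OPS data retrieval endpoint (Tides & Currents).
BASE_URL = "https://api.tidesandcurrents.noaa.gov/api/prod/datagetter"


def build_water_level_url(station, begin_date, end_date, datum="MSL"):
    """Return a request URL for monthly mean water levels at one station.

    Hypothetical helper for illustration; dates use the YYYYMMDD format.
    """
    params = {
        "product": "monthly_mean",   # assumed product for this sketch
        "station": station,
        "begin_date": begin_date,
        "end_date": end_date,
        "datum": datum,
        "units": "metric",
        "time_zone": "gmt",
        "format": "json",
        "application": "example_client",  # hypothetical identifier
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

Because the function only assembles the URL, it can be checked without network access; the actual fetch could then be made with any HTTP client.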

This poster will cover key differences between these AWS GovCloud models and highlight their benefits and tradeoffs. Diagrams representing the architecture used by each application will illustrate the key decision points that emerge in designing and deploying cloud-based tools. Such practical applications and models are expected to help other practitioners select optimal technology stacks and infrastructure setup for their own tools.
