8A.4 Enterprise Data Management (EDM) and Enterprise Product Generation (EPG) Proving Ground in the Amazon Web Services (AWS) Cloud – Final Report

Wednesday, 15 January 2020: 11:15 AM
253B (Boston Convention and Exhibition Center)
Rich Baker, Solers, Greenbelt, MD; and P. MacHarrie, H. Phung, J. Hansford, S. Causey, J. Sobanski, S. Walsh, M. Leach, R. Niemann, and D. M. Beall

As part of the NOAA/NESDIS/OSGS Environmental Satellite Processing and Distribution System (ESPDS) program, Solers was funded to create a proving ground for Enterprise Data Management (EDM) and Enterprise Product Generation (EPG) services in a FedRAMP-approved Amazon Web Services (AWS) cloud environment, leveraging native AWS cloud services and NESDIS product generation algorithms. The EDM and EPG services were derived from capabilities of the on-premise ESPDS operational product generation subsystem known as NOAA Data Exploitation (NDE), then refactored and enhanced to take advantage of native AWS cloud services and scalability features.

The EDM service provides data storage along with a flexible, searchable inventory/catalog of product metadata that can support the integration of all NESDIS products. It leverages a variety of native AWS cloud services, including Simple Storage Service (S3), Elasticsearch Service, Relational Database Service (RDS), Lambda, and API Gateway. The EPG service can generate all NESDIS level 1+ sensor, science, and tailored products. It also leverages native AWS cloud services, including Elastic Compute Cloud (EC2) with Auto-Scaling and Simple Queue Service (SQS). Together, the EDM and EPG services will give NESDIS added flexibility in provisioning computational processing, metadata management, and data storage on the resources that best meet product, mission, and cost requirements.
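The division of labor between the two services can be illustrated with a minimal, in-memory sketch. It is not the project's implementation: the names Granule, run_pipeline, and search are hypothetical, a Python deque stands in for an SQS job queue, and a plain list stands in for the EDM metadata catalog. It only assumes the general pattern described above, in which ingested data becomes queued work, a worker runs a product generation algorithm, and the result is recorded in a searchable inventory.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Granule:
    """Hypothetical record for one ingested input file."""
    mission: str   # e.g. a satellite mission name
    product: str   # input product type, used to select an algorithm
    key: str       # storage key of the input (stand-in for an object-store key)


def run_pipeline(granules, algorithms):
    """Drain a job queue, run the matching algorithm, and catalog each output.

    `algorithms` maps an input product type to a callable that takes a
    Granule and returns the name of the generated product.
    """
    jobs = deque(granules)   # stand-in for a message queue of pending jobs
    catalog = []             # stand-in for the searchable metadata inventory
    while jobs:
        granule = jobs.popleft()
        algorithm = algorithms.get(granule.product)
        if algorithm is None:
            continue  # no algorithm registered for this input type
        catalog.append({
            "mission": granule.mission,
            "product": algorithm(granule),
            "source": granule.key,
        })
    return catalog


def search(catalog, **criteria):
    """Return catalog entries whose fields equal every given criterion."""
    return [entry for entry in catalog
            if all(entry.get(field) == value for field, value in criteria.items())]
```

In a real deployment the worker loop would run on many compute instances that scale with queue depth, which is what makes the queue a natural decoupling point between ingest and product generation.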

The primary objectives of this EDM and EPG Proving Ground initiative were:
1. To leverage the flexibility and agility provided by a cloud environment to prototype candidate architectures and implementations for EDM and EPG services, and evaluate them for efficacy, performance, scalability, and maintainability.
2. To demonstrate the flexibility of the proposed EPG service to execute multiple types of algorithms, such as existing NDE 2.0 product algorithms, JPSS Risk Reduction algorithms, Enterprise Algorithm implementations of legacy products, and/or GOES-R L2+ Product Algorithms.

The secondary objectives of this EDM and EPG Proving Ground initiative were:
1. To consider how cloud-hosted EDM and EPG services could be used for collaboration and integration of future product generation algorithms, both within NESDIS and with collaborative research organizations.
2. To identify cost breakpoints for technology, ingress & egress, performance, etc.

This EDM and EPG Proving Ground initiative demonstrated the ability to run EDM and EPG services for the generation and management of NESDIS products at scale in a FedRAMP-authorized cloud environment, which has the potential to reduce facility, hardware & software, license & maintenance, administrative, and operations lifecycle costs. S-NPP, JPSS-1, and GCOM-W data were ingested in near real time into the AWS cloud-hosted EDM and EPG Proving Ground environment from the on-premise ESPDS Integration & Test (I&T) Environment at the NOAA Satellite Operations Facility (NSOF) in Suitland, MD. GOES-16 data were also ingested in near real time into the same environment from the NOAA Big Data Project’s GOES-16 AWS S3 bucket. Numerous NOAA product generation algorithms for NOAA Geostationary and Polar-Orbiting missions were integrated into the system, and a 21-day test was conducted to demonstrate and test the system’s capabilities and to measure AWS cloud costs under an operationally equivalent load.

At the 2019 AMS Annual Meeting, Solers provided a briefing introducing this project, its architecture, and its status to date. The project has since completed its development and integration, conducted the 21-day test with operationally equivalent data ingest and product generation loads at scale, and, in April 2019, delivered a final report containing our findings and recommendations. In this 2020 AMS Annual Meeting briefing, we will present the final architecture, the average daily data volumes from the 21-day test, and an estimated yearly cost of running production product generation in the cloud with this architecture, along with the potential cost savings relative to the existing operational on-premise system. We will also share our lessons learned and recommendations.
