J15C.2 Operationalizing a Machine Learning Approach to Post-Processing High Resolution NWP Forecasts

Thursday, 1 February 2024: 2:00 PM
327 (The Baltimore Convention Center)
Luke Conibear, Tomorrow.io, Sheffield, United Kingdom; and A. E. Payne, A. Reed Harris, K. Keshavamurthy, M. E. Green, MA, T. McCandless, and S. Flampouris

Handout (863.4 kB)

We present an operationalized machine learning model that post-processes high-resolution, deterministic weather forecasts to produce probabilistic forecasts for seven core weather variables over the contiguous United States. This approach combines the strengths of high-resolution Numerical Weather Prediction (NWP) modeling with machine learning (ML) to generate more accurate deterministic and probabilistic forecasts, thus adding substantial predictive and actionable information for stakeholders. To maximize the value of such forecasts, the model was required to run operationally with minimal forecast latency and cost. Our productionization process involved refactoring from notebooks to modular scripts and adding a full unit test suite, experiment tracking, custom libraries, containerization (Docker), continuous integration (GitHub Actions), continuous deployment (Kubernetes), infrastructure-as-code (Terraform), cost analysis, and performance optimizations. As a result, our distributed training process using Horovod takes 8 hours on eight NVIDIA V100 graphics processing units (GPUs) using mixed precision at >60% utilization (approximately $65 on SageMaker as of June 2023 with spot training). We run inference on a batch of timesteps to create a complete forecast, which takes 5 minutes on one Cascade Lake central processing unit (CPU), demonstrating efficient and cost-effective post-processing of high-resolution NWP models with ML to generate probabilistic forecasts.
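
The training and inference figures quoted above can be turned into rough unit costs. The following is a minimal back-of-the-envelope sketch, not the authors' cost analysis. It assumes that the 8 hours is wall-clock time across all eight GPUs and that the approximate $65 covers the whole training job. The function names are hypothetical, and the forecasts-per-day figure is only an upper bound that assumes back-to-back runs on a single CPU.

```python
# Figures quoted in the abstract.
TRAIN_HOURS = 8          # wall-clock training time (assumed to be per job)
NUM_GPUS = 8             # NVIDIA V100 GPUs
TRAIN_COST_USD = 65.0    # approximate SageMaker spot cost, June 2023
INFERENCE_MINUTES = 5    # one complete forecast on one Cascade Lake CPU


def gpu_hours(hours, gpus):
    """Total GPU-hours consumed by one training job."""
    return hours * gpus


def cost_per_gpu_hour(total_cost, hours, gpus):
    """Implied cost per GPU-hour, assuming total_cost covers the whole job."""
    return total_cost / gpu_hours(hours, gpus)


def forecasts_per_cpu_day(minutes_per_forecast):
    """Upper bound on complete forecasts per day if one CPU ran back to back."""
    return (24 * 60) // minutes_per_forecast


if __name__ == "__main__":
    print("GPU-hours per training job:", gpu_hours(TRAIN_HOURS, NUM_GPUS))
    print("Implied USD per GPU-hour:",
          round(cost_per_gpu_hour(TRAIN_COST_USD, TRAIN_HOURS, NUM_GPUS), 2))
    print("Max forecasts per CPU-day:", forecasts_per_cpu_day(INFERENCE_MINUTES))
```

Under these assumptions, one training job uses 64 GPU-hours, which implies roughly one US dollar per GPU-hour at the quoted spot price.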