We have been applying deep learning to the problem of precipitation nowcasting, with the objective of high-resolution 1 to 3 hour forecasts of rain and snow in the continental United States at 1 km spatial resolution. Observational data include both remote-sensing and ground-station sources: GEO and LEO satellites, Doppler radar, rain gauges, and geographic and climate information. We use machine learning architectures originally developed for computer vision applications, using interpolation to match the sources' spatial resolutions.
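As a minimal sketch of that spatial-resolution matching step (assuming NumPy and SciPy; the grid sizes and the two-source stack are illustrative, not our production pipeline), each source can be bilinearly resampled onto a common 1 km grid and then stacked as image channels:

```python
import numpy as np
from scipy.ndimage import zoom

# Illustrative native grids over the same domain (not the real sizes):
# a GEO satellite band at ~2 km -> 512x512; a radar mosaic at ~1 km -> 1024x1024.
goes_frame = np.random.rand(512, 512).astype(np.float32)
radar_frame = np.random.rand(1024, 1024).astype(np.float32)

def to_common_grid(field, target_hw=(1024, 1024)):
    """Bilinearly resample a 2D field onto the common 1 km grid."""
    factors = (target_hw[0] / field.shape[0], target_hw[1] / field.shape[1])
    return zoom(field, factors, order=1)  # order=1 -> bilinear

# Stack the resampled sources as channels of a single image, CV-style.
inputs = np.stack([to_common_grid(goes_frame), radar_frame], axis=-1)
print(inputs.shape)  # (1024, 1024, 2)
```

Bilinear interpolation is a common default for continuous geophysical fields; nearest-neighbor resampling may be preferable for categorical inputs such as land/sea masks.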
We pose the forecasting problem as prediction over histories of images. For a precipitation nowcast, a typical input history includes 2 hours from any or all of the following: (a) 8 GOES-16 GEO satellite images (15-minute resolution), (b) 8 or more Doppler radar images (2-minute resolution), (c) 1 IMERG/GPM LEO satellite image (~3-hour resolution), (d) PRISM climate (the current month's 30-year climatology, produced in the prior year), and (e) elevation (updated less often than monthly). Each input observation is paired with its measurement time relative to the forecast time. The output history includes 3 hours of predicted radar images and/or rain gauge measurements.
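To make this formulation concrete, here is a hedged sketch of one way such an input history could be represented, with each observation tagged by its measurement time relative to the forecast issue time (negative values are in the past). The class names, field names, cadences, and shapes are illustrative assumptions, not our actual schema:

```python
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Observation:
    source: str              # e.g. "goes16", "radar", "imerg", "prism", "elevation"
    minutes_from_now: float  # relative measurement time; negative = in the past
    image: np.ndarray        # 2D field already resampled to the common 1 km grid

@dataclass
class NowcastExample:
    inputs: List[Observation] = field(default_factory=list)
    target_times: List[float] = field(default_factory=list)  # minutes ahead
    targets: List[np.ndarray] = field(default_factory=list)  # future radar frames

example = NowcastExample()
# 2 h of GOES-16 at 15-min cadence: t = -120, -105, ..., -15 min.
for t in range(-120, 0, 15):
    example.inputs.append(Observation("goes16", t, np.zeros((1024, 1024), np.float32)))
# A single recent IMERG/GPM frame (~3 h cadence), possibly hours old.
example.inputs.append(Observation("imerg", -95.0, np.zeros((1024, 1024), np.float32)))
# Targets out to +3 h (cadence assumed to be 15 min here).
example.target_times = list(range(15, 181, 15))
```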
We improve on similar previous deep learning approaches in two ways. First, to resolve the temporal differences between the satellite and radar data, we explicitly include the time lag as a model input, as opposed to simply interpolating to force one source to temporally match the other. Second, we apply deep learning directly to forecasting future precipitation from present observations, as opposed to relying on non-ML approaches such as optical flow.
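One simple way to expose the lag to a convolutional model is to append a constant channel holding each observation's normalized age; the sketch below illustrates the idea and is not our exact encoding (the normalization scale is an assumption):

```python
import numpy as np

def with_time_lag_channel(image: np.ndarray, minutes_from_now: float,
                          scale: float = 180.0) -> np.ndarray:
    """Append a constant channel encoding the observation's age.

    Rather than temporally interpolating the source to a common timestamp,
    the normalized lag is handed to the network as an extra input channel,
    letting the model account for staleness itself.
    """
    lag = np.full(image.shape[:2], minutes_from_now / scale, dtype=np.float32)
    if image.ndim == 2:
        image = image[..., None]
    return np.concatenate([image, lag[..., None]], axis=-1)

frame = np.zeros((1024, 1024), dtype=np.float32)
x = with_time_lag_channel(frame, minutes_from_now=-95.0)
print(x.shape)  # (1024, 1024, 2)
```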
Deep learning methods have shown an uncanny ability to provide predictions purely from data, without the need to encode domain-specific knowledge into the models. This allows us to build models without the complexity of adding atmospheric physics, instead relying on the models to capture those dynamics directly from observational data. In the talk, we describe the formulation of the machine learning problem, the deep learning models used, and how we handle grids at disparate time scales, latencies, and resolutions.
We end with a short discussion of the results of evaluating models on 2019 data after training them on 2018 data. Our short-term forecasts (<3 hours) compare favorably with numerical methods. A significant reason is that deep learning offers very low latency (compute latency <1 minute), so the forecast is based on input data that is fresh relative to NWP. There is a crossover point beyond which NWP begins to outperform our ML models; in our benchmarks against HRRR, the crossover is around 3 hours. In this work we investigate a pure deep learning approach, but future work includes investigating hybrid DL and NWP approaches, which may yield stronger models than either approach alone.