Monday, 29 January 2024
Hall E (The Baltimore Convention Center)
Recent efforts in building data-driven surrogates for weather forecasting have received a lot of attention and achieved notable success. These autoregressive data-driven models yield highly competitive short-term forecasting performance (often outperforming traditional numerical weather models) at a fraction of the computational cost of numerical models. However, these data-driven models do not remain stable when time-integrated over long horizons. Stable long time-integration would provide (1) a method to seamlessly scale a weather model to a climate model and (2) a way to gather insights into the statistics of the climate system, e.g., extreme events, owing to the low cost of generating large ensembles. While many studies have reported this instability, especially for data-driven models of turbulent flow, a causal mechanism for it has remained unclear, and most efforts to obtain stability are ad hoc and empirical. In this work, we use a canonical quasi-geostrophic model to present a causal mechanism for this instability through the lens of a phenomenon called “spectral bias” in deep learning theory. We show how spectral bias compounds the error at small scales, which, through the inverse cascade, eventually intensifies the errors at large scales. Furthermore, we show how this compounding error increases the intensity of the meridional heat and momentum fluxes, leading to an unstable and unphysical flow. We further perform a rigorous theoretical eigenvalue analysis that provides a framework to identify unstable autoregressive models and quantitatively predict the compounding error growth.
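The flavor of the eigenvalue argument can be sketched in a few lines: if the linearized one-step map of an autoregressive model has spectral radius greater than one, a small error compounds geometrically under repeated application. This is a minimal toy illustration, not the talk's actual framework; the matrix J stands in for the Jacobian of a trained model at a reference state, and all names and values are hypothetical.

```python
import numpy as np

# Toy linearized one-step map of an autoregressive surrogate.
# J is an assumed stand-in for the model Jacobian at a state;
# we rescale it so its spectral radius is 1.05 (> 1, i.e. unstable).
rng = np.random.default_rng(0)
J = rng.normal(size=(8, 8))
J *= 1.05 / np.max(np.abs(np.linalg.eigvals(J)))

spectral_radius = float(np.max(np.abs(np.linalg.eigvals(J))))

# A tiny initial error compounds under repeated application of J,
# growing roughly like spectral_radius ** n over n steps.
e = 1e-6 * rng.normal(size=8)
norms = [float(np.linalg.norm(e))]
for _ in range(200):
    e = J @ e
    norms.append(float(np.linalg.norm(e)))
```

A spectral radius below one would instead damp the error, which is the quantitative sense in which such an analysis can separate stable from unstable autoregressive models.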
To mitigate this issue, we provide a rigorous, architecture-agnostic, three-pronged approach (demonstrated on different neural networks and neural operators) in which (1) a higher-order integrator is constructed within the model architecture, (2) a spectral regularizer is introduced to reduce the effect of spectral bias, and (3) a self-supervised optimization strategy is introduced, wherein the deepest layers are updated during autoregressive inference to correct the Fourier spectrum at small scales. Finally, we apply our approach to a data-driven model trained on ERA5 data as well as ocean reanalysis data and show promising results in both short-term performance and long-term stability, opening new frontiers toward seamless weather-to-climate models.
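The second prong, a spectral regularizer, can be sketched as a penalty on the mismatch between predicted and target Fourier amplitude spectra above a cutoff wavenumber, where spectral bias concentrates the error. This is a hedged sketch under our own assumptions, not the regularizer from the talk; the function name, the cutoff `k_cut`, and the test signals are all illustrative.

```python
import numpy as np

def spectral_penalty(pred, target, k_cut=16):
    """Hypothetical spectral regularizer: mean-squared mismatch of the
    Fourier amplitude spectra above cutoff wavenumber k_cut (the small
    scales). Would be added to the data loss with some weight."""
    pred_amp = np.abs(np.fft.rfft(pred))
    target_amp = np.abs(np.fft.rfft(target))
    return float(np.mean((pred_amp[k_cut:] - target_amp[k_cut:]) ** 2))

# Illustrative 1-D signals: the "smooth" prediction is missing the
# small-scale (wavenumber-40) component present in the target.
x = np.linspace(0.0, 2.0 * np.pi, 128, endpoint=False)
target = np.sin(3.0 * x) + 0.2 * np.sin(40.0 * x)
smooth = np.sin(3.0 * x)

penalty = spectral_penalty(smooth, target)  # positive: small scales missing
```

The penalty is zero when the small-scale spectrum is matched and grows as the prediction loses small-scale energy, which is the behavior a term counteracting spectral bias needs.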

