E100 Explainable Deep Learning for Climate Applications Using the Spectral Analysis of Regression Activations and Kernels (SpARK) Framework

Thursday, 1 February 2024
Hall E (The Baltimore Convention Center)
Yifei Guan, Rice University, Houston, TX; and A. K. Chattopadhyay, A. Subel, H. A. Pahlavan, and P. Hassanzadeh

The use of deep neural networks (NNs) in critical applications such as weather/climate prediction and turbulence modeling is growing rapidly. While some successful results have been reported in the past few years, two major concerns remain: the lack of interpretability and the inability of these NNs to work for other systems (i.e., to generalize). Here, we introduce a new framework that combines spectral (Fourier) analyses of NNs and nonlinear physics, leveraging recent advances in the theory and applications of deep learning, to move toward rigorous analysis of deep NNs for applications involving dynamical systems such as climate and turbulent systems. We use examples from subgrid-scale modeling of geophysical turbulence, atmospheric gravity waves, and weather forecasting to show how this framework can systematically address challenges around explainability, generalizability, and stability. For example, the framework shows that in many such applications, the millions of learned parameters in deep convolutional NNs reduce to a few classes of known spectral filters, such as low-pass filters and Gabor wavelets. This analysis enables us to explain what the NNs have learned in terms of the underlying physics. We use this framework, which is broadly applicable across a wide range of applications, to explain and improve techniques such as transfer learning (TL) and offline-online learning that help NNs generalize better, both within-distribution and out-of-distribution. Our work provides a step toward fully explainable NNs for wide-ranging applications in science and engineering, such as climate change modeling.
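To illustrate the spectral analysis of learned kernels described above, the following is a minimal sketch (not the authors' SpARK code): a small 2D convolution kernel is zero-padded, Fourier-transformed, and classified by where its amplitude spectrum peaks. A low-pass filter peaks at wavenumber zero, while a Gabor-like kernel peaks at a nonzero wavenumber. The function names, padding size, and example kernels are illustrative assumptions.

```python
# Hedged sketch: spectral characterization of learned 2D convolution kernels.
# Names, kernel sizes, and thresholds are illustrative assumptions, not the
# authors' implementation.
import numpy as np

def kernel_spectrum(kernel: np.ndarray, size: int = 64) -> np.ndarray:
    """Zero-pad a small 2D kernel and return its Fourier amplitude spectrum."""
    padded = np.zeros((size, size))
    k = kernel.shape[0]
    padded[:k, :k] = kernel
    # fftshift moves the zero wavenumber to the center of the spectrum.
    return np.abs(np.fft.fftshift(np.fft.fft2(padded)))

def dominant_wavenumber(spectrum: np.ndarray) -> float:
    """Radial distance (in grid wavenumbers) of the spectral peak from the origin."""
    size = spectrum.shape[0]
    iy, ix = np.unravel_index(np.argmax(spectrum), spectrum.shape)
    return float(np.hypot(ix - size // 2, iy - size // 2))

if __name__ == "__main__":
    # A 5x5 averaging kernel: spectrum peaks at wavenumber 0 (low-pass filter).
    low_pass = np.ones((5, 5)) / 25.0
    # A Gaussian-windowed cosine (Gabor-like kernel): peaks at a nonzero wavenumber.
    x = np.arange(-3, 4)
    xx, yy = np.meshgrid(x, x)
    gabor = np.exp(-(xx**2 + yy**2) / 4.0) * np.cos(np.pi * xx / 2.0)
    for name, k in [("low-pass", low_pass), ("Gabor-like", gabor)]:
        print(name, dominant_wavenumber(kernel_spectrum(k)))
```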

Fig. 1. Overview of the framework for guiding and explaining TL onto a new target system. The top row shows the steps of the TL process: acquiring a large amount of training data from the base system and a small amount from the target system, training a Base Neural Network (BNN) using data from the base system, and re-training it using data from the target system to obtain a Transfer-Learning Neural Network (TLNN). The bottom row presents the analyses involved in this framework, listed (left to right) in the order in which they should be used. The arrows indicate what is needed from each step of the TL process by the corresponding analyses. Here, the blue line represents data from the target system, the red line represents the trained BNN, and the orange line represents the re-trained TLNN.
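The re-training step in Fig. 1 can be sketched as follows, assuming a simple fully convolutional PyTorch network as a stand-in for the BNN. This is not the authors' code; the architecture, optimizer settings, and choice of which layer to re-train are illustrative assumptions. In the SpARK framework, the spectral analyses are what guide that layer choice.

```python
# Hedged sketch of the TL step: freeze the trained BNN, re-train one layer on a
# small target-system dataset to obtain the TLNN. All settings are assumptions.
import torch
import torch.nn as nn

def build_cnn(channels: int = 64, layers: int = 5) -> nn.Sequential:
    """A simple fully convolutional network standing in for the BNN."""
    blocks = [nn.Conv2d(1, channels, 5, padding=2), nn.ReLU()]
    for _ in range(layers - 2):
        blocks += [nn.Conv2d(channels, channels, 5, padding=2), nn.ReLU()]
    blocks.append(nn.Conv2d(channels, 1, 5, padding=2))
    return nn.Sequential(*blocks)

def transfer_learn(bnn: nn.Module, target_loader, retrain_layer: int, epochs: int = 10):
    """Freeze all parameters, then re-train a single layer on target-system data."""
    for p in bnn.parameters():
        p.requires_grad = False
    layer = list(bnn.children())[retrain_layer]
    for p in layer.parameters():
        p.requires_grad = True
    opt = torch.optim.Adam(layer.parameters(), lr=1e-4)
    loss_fn = nn.MSELoss()
    for _ in range(epochs):
        for x, y in target_loader:   # small amount of target-system data
            opt.zero_grad()
            loss = loss_fn(bnn(x), y)
            loss.backward()
            opt.step()
    return bnn                        # the re-trained network is the TLNN
```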

Acknowledgments

This work was supported by an award from the ONR Young Investigator Program (N00014-20-1-2722), a grant from the NSF CSSI program (OAC-2005123), and by the generosity of Eric and Wendy Schmidt by recommendation of the Schmidt Futures program.
