819 Using Machine Learning to Derive Linearized Physical Parameterizations

Tuesday, 14 January 2020
Hall B (Boston Convention and Exhibition Center)
Victor Marchais, UCAR, Boulder, CO; and D. Holdaway and T. Auligné

Linearized Earth system models, comprising the tangent linear model (TLM) and adjoint, are required in a number of important applications, for example 4DVar data assimilation, sensitivity studies, computation of singular vectors and forecast sensitivity observation impacts. For some components of the Earth system, for instance the physical parameterization schemes, developing the tangent linear and adjoint is a significant burden. The algorithms can be highly nonlinear and linearly unstable; the schemes often change, meaning the linearized versions have to be redeveloped; and the resulting code can be very slow. Neural networks offer an attractive alternative to developing linearized physics schemes. Their training typically relies on the adjoint of the network infrastructure, making access to the gradient somewhat straightforward; they can be designed to be relatively smooth and continuous; retraining as the underlying schemes change would require little manual effort; and they offer extremely fast execution times. To assess the use of neural networks to produce linearized versions of the physics, we train a neural network on the longwave Rapid Radiative Transfer Model for GCMs (RRTMG).

The structure of atmospheric profile data fits both convolutional and recurrent neural network architectures. Convolutional networks generally give smoother results and are chosen given their previous success on this kind of problem. Using physical knowledge to customize the architectures, it is possible to emulate the RRTMG model with very little loss in accuracy. The U-Net architecture, which captures both low- and high-frequency patterns, gives the best performance when compared with a number of other candidate architectures.

To assess whether a candidate neural network would produce the correct TLM and adjoint, the Jacobian of the original scheme was compared with the Jacobian of the neural network. In both cases the Jacobian was obtained using a finite difference approach with very small perturbations. These ‘nonlinear’ Jacobians were compared with the actual TLM of the neural network, obtained directly through Keras/TensorFlow. While the nonlinear neural network Jacobian agreed very well with the gradient of the neural network, neither matched well with the Jacobian of the model the network was trained on, despite rigorous checks for overfitting. The findings of this work suggest it may not be possible to achieve a neural network that produces a useful linearized model using standard methods.
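
The kind of Jacobian comparison described above can be illustrated with a minimal sketch. It is not the authors' code: the two-layer `toy_network`, its weights `W1` and `W2`, and the step size `eps` are hypothetical stand-ins, and central differences are assumed here although the abstract does not say which finite difference stencil was used. The sketch compares a finite difference ('nonlinear') Jacobian with the exact derivative of the same function, which is the role played by the Keras/TensorFlow gradient in the study.

```python
import math

def toy_network(x, w1, w2):
    """Hypothetical stand-in for a trained emulator: y = W2 . tanh(W1 . x)."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

def analytic_jacobian(x, w1, w2):
    """Exact Jacobian dy/dx = W2 . diag(1 - tanh^2(W1 . x)) . W1."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in w1]
    slope = [1.0 - h * h for h in hidden]
    return [[sum(w2[i][k] * slope[k] * w1[k][j] for k in range(len(w1)))
             for j in range(len(x))]
            for i in range(len(w2))]

def finite_difference_jacobian(f, x, eps=1e-6):
    """Central differences: column j is (f(x + eps*e_j) - f(x - eps*e_j)) / (2*eps)."""
    n_out = len(f(x))
    jac = [[0.0] * len(x) for _ in range(n_out)]
    for j in range(len(x)):
        x_plus = list(x)
        x_plus[j] += eps
        x_minus = list(x)
        x_minus[j] -= eps
        y_plus, y_minus = f(x_plus), f(x_minus)
        for i in range(n_out):
            jac[i][j] = (y_plus[i] - y_minus[i]) / (2.0 * eps)
    return jac

def max_abs_difference(a, b):
    """Largest element-wise absolute difference between two matrices."""
    return max(abs(p - q) for ra, rb in zip(a, b) for p, q in zip(ra, rb))

# Arbitrary illustrative weights and input profile (not from the study).
W1 = [[0.5, -0.3, 0.8], [0.1, 0.9, -0.4]]
W2 = [[1.2, -0.7], [0.3, 0.6], [-0.5, 0.2]]
X0 = [0.2, -0.1, 0.4]

J_EXACT = analytic_jacobian(X0, W1, W2)
J_FD = finite_difference_jacobian(lambda x: toy_network(x, W1, W2), X0)
JACOBIAN_MISMATCH = max_abs_difference(J_EXACT, J_FD)
```

For a smooth function such as this one the two Jacobians agree closely, mirroring the reported agreement between the network's nonlinear Jacobian and its gradient. The difficulty reported in the abstract lies in the other comparison, against the Jacobian of the original scheme.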

Rather than obtaining the linearized model implicitly, the U-Net is instead trained to produce the nonlinear Jacobian directly. Producing the Jacobian is equivalent to having both the TLM and the adjoint. The problem becomes similar to image generation, which is well suited to U-Nets. This time we use a 2D U-Net to produce the output.
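
To illustrate why a Jacobian supplies both linearized operators, the following sketch applies a small, made-up Jacobian matrix as a TLM (the product J·dx) and as an adjoint (the product Jᵀ·dy), then runs the standard dot-product consistency check ⟨J·dx, dy⟩ = ⟨dx, Jᵀ·dy⟩. The matrix `JAC`, the perturbations `DX` and `DY`, and the function names are hypothetical and chosen only for illustration.

```python
def tangent_linear(jac, dx):
    """TLM: propagate an input perturbation forward, dy = J . dx."""
    return [sum(jij * dxj for jij, dxj in zip(row, dx)) for row in jac]

def adjoint(jac, dy):
    """Adjoint: propagate an output sensitivity backward, dx* = J^T . dy."""
    n_in = len(jac[0])
    return [sum(jac[i][j] * dy[i] for i in range(len(jac))) for j in range(n_in)]

def dot(a, b):
    """Euclidean inner product of two vectors."""
    return sum(p * q for p, q in zip(a, b))

# Made-up 2x3 Jacobian (2 outputs, 3 inputs) and test vectors.
JAC = [[2.0, -1.0, 0.5],
       [0.0, 3.0, 1.0]]
DX = [1.0, 2.0, -1.0]
DY = [0.5, -2.0]

# Dot-product test: the two inner products must match for a consistent pair.
LHS = dot(tangent_linear(JAC, DX), DY)
RHS = dot(DX, adjoint(JAC, DY))
```

Both operators are read off the same matrix, so a network that outputs the Jacobian for a given profile would provide a TLM and adjoint that are consistent with each other by construction.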
