The implementation is based on limiting costly data transfer between the GPU and the host processor. Traditionally, data is transferred to and from the GPU for the parts of the code that are executed on the GPU. We present an implementation in which all computation is executed on the GPU, so data transfer needs to occur only once at the beginning and once at the end of the simulation, avoiding transfers after each timestep. The model state thus remains on the GPU for the duration of the simulation run, and the CPU is used solely for input/output operations and for managing the execution of the GPU kernels. This allows for greater speed-up than a modular approach in which some parts of the code execute on the host and the remaining parts on the GPU.
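As a minimal sketch of the pattern described above (with hypothetical kernel and variable names, not the authors' code), the structure in CUDA looks as follows: one host-to-device copy before the time loop, a loop in which the CPU only launches kernels on device-resident state, and one device-to-host copy at the end.

```cuda
// Sketch of the GPU-resident simulation pattern: the model state is copied
// to the device once, every timestep runs as a kernel on data that never
// leaves the GPU, and the result is copied back once at the end.
#include <cstdio>
#include <cstdlib>
#include <cuda_runtime.h>

// Hypothetical update kernel: one explicit timestep over the model state.
__global__ void step(float *state, int n, float dt) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) state[i] += dt * state[i];  // placeholder update rule
}

int main() {
    const int n = 1 << 20, nsteps = 1000;
    const float dt = 1e-3f;
    float *h = (float *)malloc(n * sizeof(float));
    for (int i = 0; i < n; ++i) h[i] = 1.0f;

    float *d;
    cudaMalloc(&d, n * sizeof(float));
    // Single host-to-device transfer at the start of the simulation.
    cudaMemcpy(d, h, n * sizeof(float), cudaMemcpyHostToDevice);

    // The CPU only manages kernel launches; the state stays on the GPU.
    for (int t = 0; t < nsteps; ++t)
        step<<<(n + 255) / 256, 256>>>(d, n, dt);

    // Single device-to-host transfer at the end of the simulation.
    cudaMemcpy(h, d, n * sizeof(float), cudaMemcpyDeviceToHost);
    printf("state[0] = %f\n", h[0]);

    cudaFree(d);
    free(h);
    return 0;
}
```

In a modular approach, the two `cudaMemcpy` calls would instead sit inside the time loop, and their cost would be paid at every timestep.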
Preliminary speed-up results, as well as the programming challenges involved in this implementation, are presented.