171 Design a Fast Multi-Radar Gridding Algorithm on Modern CPU and GPU Hardware

Tuesday, 29 August 2017
Zurich (Swissotel Chicago)
Jingyin Tang, Univ. of Florida, Gainesville, FL; and K. Park, C. J. Matyas, and M. Schneider

In the U.S., approximately 160 Weather Surveillance Radar-1988 Doppler (WSR-88D) units generate a continuous data stream at a rate of about 2 TB/day. Converting weather radar gates to high-resolution three-dimensional grids in real time is critical to decision warning applications. The gridding should complete as quickly as possible to leave sufficient margin for downstream algorithms. In this research, we design and implement a highly efficient gridding algorithm in an asynchronous, parallel style. The algorithm employs modern computing concepts (e.g., Single Instruction, Multiple Data (SIMD) processing and Advanced Vector Extensions (AVX) instructions), a carefully designed workflow (e.g., asynchronous data transfer), fine-tuned memory access control, and an inverse-interpolation strategy. Interpolation is the most computationally intensive part of the gridding algorithm. We use a spheroid-based, no-gap beam-filling algorithm to calculate inversely from a grid cell's geographic position (latitude, longitude, and altitude) to virtual-beam-like radar-centric coordinates (elevation, azimuth, and range gate). The weighting function is (1) exponentially penalized by the angular difference between the virtual beam and the surrounding beams, (2) inverse-distance weighted by gate difference, and (3) linearly weighted by temporal difference. Instead of searching for the nearest gates to interpolate to a cell, each radar gate propagates its contribution of weights and weighted echo (e.g., reflectivity) to the surrounding cells. This interpolation function can be fully parallelized and vectorized across a large number of threads. A swapping buffer data structure buffers data input from disk to minimize I/O wait in the implementation. We carefully tuned the memory access pattern to avoid as many cache misses as possible, which brings a 70% performance improvement.
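The three-part weighting and the gate-to-cell propagation described above can be illustrated with a minimal, serial Python sketch. It is not the authors' implementation: the function names, the constants `ANGLE_SCALE_DEG` and `MAX_AGE_S`, the exact form of each weighting term, and the flat cell indexing are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

# Illustrative constants; the abstract gives no numeric values (assumptions).
ANGLE_SCALE_DEG = 1.0   # e-folding scale of the angular penalty
MAX_AGE_S = 300.0       # age at which the temporal weight reaches zero


def gate_weight(angle_diff_deg, gate_diff, age_s):
    """Combine the three weighting terms named in the abstract (assumed forms)."""
    w_angle = math.exp(-abs(angle_diff_deg) / ANGLE_SCALE_DEG)  # (1) exponential penalty
    w_range = 1.0 / (1.0 + abs(gate_diff))                      # (2) inverse distance; +1 avoids 1/0
    w_time = max(0.0, 1.0 - age_s / MAX_AGE_S)                  # (3) linear decay with age
    return w_angle * w_range * w_time


def scatter_gate(sum_wv, sum_w, neighbours, value):
    """Propagate one gate's weight and weighted echo to its surrounding cells.

    neighbours: list of (cell_index, angle_diff_deg, gate_diff, age_s).
    """
    for cell, angle_diff, gate_diff, age in neighbours:
        w = gate_weight(angle_diff, gate_diff, age)
        sum_wv[cell] += w * value
        sum_w[cell] += w


def finalize(sum_wv, sum_w):
    """Normalize the accumulated echo; cells that received no weight become NaN."""
    return [wv / w if w > 0.0 else float("nan") for wv, w in zip(sum_wv, sum_w)]


# Usage: two gates contribute to a three-cell grid.
sum_wv, sum_w = [0.0] * 3, [0.0] * 3
scatter_gate(sum_wv, sum_w, [(0, 0.0, 0.0, 0.0), (1, 0.5, 1.0, 60.0)], 40.0)
scatter_gate(sum_wv, sum_w, [(1, 0.2, 0.0, 0.0)], 20.0)
grid = finalize(sum_wv, sum_w)
```

Because each gate only adds to two running sums per cell, gates can be processed independently. In a multithreaded or GPU version the accumulations into shared cells would presumably need atomic additions or a partitioning of the grid among threads; the abstract does not say which approach is used.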
A preliminary performance test shows that a single workstation equipped with a GTX 1070 graphics card can generate a nationwide radar mosaic at 500 × 500 × 250 m resolution every 2 to 5 minutes.
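The swapping buffer used to hide disk I/O can be sketched as a small producer-consumer loop in Python. This is only an illustration of the general double-buffering idea under stated assumptions: `run_double_buffered`, `read_chunk`, and `process_chunk` are hypothetical names, and the abstract does not describe the actual data structure in this detail.

```python
import queue
import threading


def run_double_buffered(read_chunk, process_chunk, n_chunks, n_buffers=2):
    """Overlap loading and processing by cycling a fixed set of reusable buffers.

    read_chunk(i) returns the items of chunk i (stands in for a disk read);
    process_chunk(buf) consumes one filled buffer (stands in for gridding).
    """
    free = queue.Queue()     # buffers ready to be filled
    filled = queue.Queue()   # buffers ready to be processed
    for _ in range(n_buffers):
        free.put([])

    def loader():
        for i in range(n_chunks):
            buf = free.get()            # block until a buffer is free
            buf.clear()
            buf.extend(read_chunk(i))   # fill while the consumer works on another buffer
            filled.put(buf)
        filled.put(None)                # sentinel: no more data

    thread = threading.Thread(target=loader, daemon=True)
    thread.start()

    results = []
    while True:
        buf = filled.get()
        if buf is None:
            break
        results.append(process_chunk(buf))
        free.put(buf)                   # hand the buffer back to the loader
    thread.join()
    return results


# Usage: four tiny "chunks" summed by the consumer.
totals = run_double_buffered(lambda i: [i, i + 1], sum, 4)
```

With two buffers, the loader can read the next chunk while the current one is being processed, so the consumer waits on I/O only when reading is slower than processing.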