Wednesday, 14 January 2009: 4:15 PM
Rapid-response urban CFD simulations using a GPU computing paradigm on desktop supercomputers
Room 124A (Phoenix Convention Center)
High-performance computing is changing radically, thanks to new programming models and advances in graphics processing unit (GPU) hardware. GPUs, traditionally designed for graphics rendering, have emerged as massively parallel "co-processors" to the central processing unit (CPU). Small-footprint desktop supercomputers with hundreds of stream processors can now deliver teraflops of peak performance at the price of conventional workstations. Such a large performance-to-price ratio creates many opportunities to advance the simulation of atmospheric transport and dispersion in urban environments. A computational fluid dynamics (CFD) simulation capability with a rapid computational turn-around time on small-footprint computing systems has the potential to transform emergency response and hazard zone prediction for contaminant dispersion in urban environments. In this study, we describe the development of a novel Cartesian grid CFD code for complex urban environments using the GPU computing paradigm. Specifically, we adopt the NVIDIA CUDA programming model to implement the discretized form of the Navier-Stokes equations on desktop supercomputers with up to four GPUs. Communication between devices is handled with POSIX threads to improve scaling across multiple GPUs. Harnessing the full compute potential of GPUs requires a clear understanding of fundamentally new GPU programming models, device architectures, and memory-access patterns. Our results confirm the compute potential of the GPU computing paradigm, with a speedup of two orders of magnitude over a serial CFD code executed on a conventional CPU. In the extended abstract, we will present the details of our CFD code implementation, which specifically targets NVIDIA's GPU architecture for improved performance, and demonstrate the computational speedup with respect to a serial CPU code written in the C programming language for the purpose of this study.
The rapid-response computational capability will be systematically validated against benchmark cases.
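As a rough illustration of the one-host-thread-per-device pattern mentioned above, the following minimal sketch splits a Cartesian grid into row slabs and relaxes each slab in its own POSIX thread. It is not the code described in this abstract: the CPU threads merely stand in for GPUs, the Jacobi sweep of the Laplace equation is an assumed stand-in for the actual Navier-Stokes discretization, and the names `slab_t`, `relax_slab`, `jacobi_slabs`, and `MAX_WORKERS` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <string.h>

#define MAX_WORKERS 4 /* assumption: mirrors the "up to four GPUs" in the abstract */

/* Work assigned to one host thread (a stand-in for one GPU). */
typedef struct {
    const double *src; /* field from the previous sweep (read-only) */
    double *dst;       /* field written during this sweep */
    int nx;            /* grid points per row */
    int row_begin;     /* first interior row owned by this thread */
    int row_end;       /* one past the last owned row */
} slab_t;

/* One Jacobi sweep over the rows owned by a single thread. */
static void *relax_slab(void *arg)
{
    const slab_t *s = (const slab_t *)arg;
    const size_t nx = (size_t)s->nx;
    for (int j = s->row_begin; j < s->row_end; ++j) {
        for (size_t i = 1; i + 1 < nx; ++i) {
            const size_t c = (size_t)j * nx + i;
            /* Average of the four neighbours (5-point Laplace stencil). */
            s->dst[c] = 0.25 * (s->src[c - 1] + s->src[c + 1] +
                                s->src[c - nx] + s->src[c + nx]);
        }
    }
    return NULL;
}

/* Runs `sweeps` Jacobi sweeps on an nx-by-ny grid whose boundary values
 * stay fixed. `a` holds the initial field and `b` is scratch storage of
 * the same size. Returns the buffer holding the final field, or NULL if
 * the arguments are out of range or a thread could not be started. */
double *jacobi_slabs(double *a, double *b, int nx, int ny,
                     int workers, int sweeps)
{
    assert(a != NULL && b != NULL && a != b);
    if (nx < 3 || ny < 3 || workers < 1 || workers > MAX_WORKERS)
        return NULL;

    /* Copy the field so both buffers carry the same boundary values. */
    memcpy(b, a, (size_t)nx * (size_t)ny * sizeof *a);

    const int interior = ny - 2;
    for (int it = 0; it < sweeps; ++it) {
        pthread_t tid[MAX_WORKERS];
        slab_t slab[MAX_WORKERS];
        int started = 0;

        for (int w = 0; w < workers; ++w) {
            slab[w].src = a;
            slab[w].dst = b;
            slab[w].nx = nx;
            slab[w].row_begin = 1 + (interior * w) / workers;
            slab[w].row_end = 1 + (interior * (w + 1)) / workers;
            if (pthread_create(&tid[w], NULL, relax_slab, &slab[w]) != 0)
                break;
            ++started;
        }
        /* Joining is the synchronisation point between sweeps; in a
         * multi-GPU code this is where slab edges would be exchanged. */
        for (int w = 0; w < started; ++w)
            pthread_join(tid[w], NULL);
        if (started != workers)
            return NULL;

        double *tmp = a; /* swap roles of the two buffers */
        a = b;
        b = tmp;
    }
    return a;
}
```

Because each sweep reads only the previous field, the result does not depend on the number of threads, so a call such as `jacobi_slabs(u, scratch, 64, 64, 4, 100)` can be checked directly against a single-thread run. Creating and joining threads on every sweep keeps the sketch short; a long-running code would more likely keep one persistent thread per device.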