Wednesday, 25 January 2012: 1:45 PM
Consolidating Distributed Operational NWP Models Into Centralized HPCs: A Case Study
Room 356 (New Orleans Convention Center)
Keith R. Searight, NCAR, Boulder, CO; and J. C. Knievel, C. Borst, J. Exby, H. H. Fisher, R. Ruttenberg, J. C. Pace, S. F. Halvorson, and F. W. Gallagher
Since the late 1990s, NCAR and the U.S. Army Test and Evaluation Command (ATEC) have collaborated on the Four-Dimensional Weather (4DWX) software system, which ingests observations of the atmosphere and uses numerical weather prediction (NWP) models to analyze and predict weather. It is based on the fifth-generation Mesoscale Model (MM5) and on the Weather Research and Forecasting (WRF) Model. 4DWX originates with state-of-the-art NWP research conducted at NCAR and culminates in operational output and products used by Army weather forecasters throughout the United States. One key aspect of this technology transfer is a staged process that injects new research techniques into an operational framework, with rigorous testing at each step. The result is a versioned system that delivers a forecast customized to the particular needs of each local site.
For many years, 4DWX ran operationally on geographically distributed HPCs at the ATEC ranges, each running its own NWP model using local dedicated RAID storage. With more recent advances in HPC technology, it has become feasible to run multiple NWP models on centralized HPCs using shared network storage. This approach offers much greater efficiency, flexibility, and extensibility at only an incremental cost.
NCAR's experiences evaluating, designing, implementing, and transitioning from a distributed to a centralized architecture will be described as a case study. The scientific requirements included forecast lengths, grid sizes, number of nested domains, and post-processing products. Some of the engineering considerations were connectivity between compute nodes, memory management, file transfer and management, job queuing and node assignment, system security, and distribution of model outputs to analysis and display systems. Pitfalls as well as successes will be summarized, and thoughts about the future of HPCs for fielding operational NWP systems such as 4DWX will be offered.