1.1 NOAA's WCOSS Upgrade and Lessons Learned for Shared Storage and Managing Memory

Thursday, 14 January 2016: 8:30 AM
Room 344 (New Orleans Ernest N. Morial Convention Center)
Michelle Mainelli, NOAA/NWS/NCEP, College Park, MD; and B. Kyger, S. Earle, and G. Vandenberghe

In the fall of 2015, NCEP Central Operations (NCO) led the upgrade of NOAA's Weather and Climate Operational Supercomputing System (WCOSS) to operate with a combined system of 2.8 PFlops and over 8 PB of storage at both geographically diverse sites in Reston, VA, and Orlando, FL. Prior to the upgrade, NCO experienced both memory and storage issues on our High Performance Computing systems. Managing memory and being able to move large amounts of data are critical to maintaining a stable production model suite. The presentation will report the status of the latest upgrade, give examples of how NCO fixed memory leaks in our codes and libraries, and describe the lessons learned in improving metadata performance and efficiently transferring a few hundred thousand files per day.
