3.4 E3SM Co-Design: Pathfinding and Evaluation of ARM Ecosystem

Tuesday, 8 January 2019: 3:45 PM
North 123 (Phoenix Convention Center - West and North Buildings)
Sarat Sreepathi, ORNL, Oak Ridge, TN; and P. W. Jones and M. A. Taylor

The Energy Exascale Earth System Model (E3SM) is a high-resolution coupled Earth system model designed to address energy-related science challenges and to make effective use of Department of Energy (DOE) supercomputers. Co-design is a design methodology built on a feedback loop between applications, system software, and the underlying computer architecture: application requirements influence hardware design, and in turn technology choices and constraints guide problem formulation and algorithm design.

Representative kernels and well-designed mini-apps that reflect critical aspects of E3SM are an essential feedback mechanism for effective co-design. A synergistic effort is under way, under the auspices of E3SM next-generation development and a related sub-project within the Exascale Computing Project, to extract performance-critical kernels from the model components and use them for pathfinding, technology evaluation, and optimization on emerging architectures and software ecosystems.

Additionally, DOE is considering an increasingly diverse set of computational architectures and associated software toolchains for future procurements such as CORAL-2 in the exascale timeframe (2021-2023). An ARM testbed cluster (Wombat) has been installed at the Oak Ridge Leadership Computing Facility to support research projects exploring the ARM architecture. This talk will present pathfinding work using E3SM kernels and mini-apps and will detail platform readiness activities for the fully coupled Earth system model on Wombat. We will report findings from an ongoing investigation into suitable compilers in the ARM ecosystem, such as ARM HPC, Flang, and GNU. Furthermore, we will present a comparative analysis of E3SM performance on ARM versus traditional CPU architectures such as x86 and Power.
