

Dagstuhl Seminar 17431

Performance Portability in Extreme Scale Computing: Metrics, Challenges, Solutions

(Oct 22 – Oct 27, 2017)


Permalink
Please use the following short url to reference this page: https://www.dagstuhl.de/17431


Motivation

Performance Portability is a critical new challenge in extreme-scale computing. In essence, performance-portable applications can be efficiently executed on a wide variety of HPC architectures without significant manual modifications. For nearly two decades, HPC architectures and programming models remained relatively stable, which allowed growth of complex multidisciplinary applications whose lifecycles span multiple generations of HPC platforms.

Recently, however, platforms have been growing much more complex, diverse, and heterogeneous - both within a single system and across systems and generations. Details already known about planned future systems indicate that this trend will continue for the foreseeable future. Current and planned large-scale HPC systems consist of complex configurations with a massive number of components. Each node has multiple multi-core sockets and often one or more additional accelerator units in the form of many-core processors or GPGPUs, resulting in a heterogeneous system architecture. Memory hierarchies, including caches, memory, and storage, are also diversifying in order to meet multiple constraints: power, latency, bandwidth, persistence, reliability, and capacity. These factors are reducing portability and forcing application teams to either spend considerable effort porting and optimizing their applications for each specific platform, or risk owning applications that perform well on perhaps only one architecture.

This Dagstuhl Seminar represents a unique opportunity to bring together international experts from the three research communities essential to tackling this performance portability challenge: developers of large-scale computational science software projects whose lifetime will span multiple generations of systems, researchers developing relevant parallel programming or system software technologies, and specialists in profiling, understanding, and modelling performance. The major research questions for the seminar are:

  • To understand challenges, design metrics, and prioritize potential solutions for performance portability: Solutions will need to synthesize existing concepts across multiple fields, including performance and productivity modeling, programming models and compilation, architectures, and system software.
  • Management of data movement in complex applications: Diverse data movement patterns dictated by different devices form one of the largest impediments to portable performance. Addressing it will require cross-cutting solutions supporting more than one abstraction, and will allow scientists to balance tradeoffs in these factors prior to design, development, or procurement of an architecture, software stack, or application.
  • Composability: Many applications require flexibility and composability because they address different physical regimes either within the same simulation, or in different instances of simulations.
  • Pathways to impact on the research community: As the community becomes more reliant on increasingly complex architectures and software stacks, it is especially important that we develop the conceptual tools to facilitate research and practical solutions for performance portability. Ignoring this topic could be devastating to the quality and sustainability of computational science software, and consequently to the science and engineering research it supports. Thus a key element of the seminar will be to tackle this challenge in major science community software projects.
Copyright Anshu Dubey, Paul H. J. Kelly, Bernd Mohr, and Jeffrey S. Vetter

Summary

This report documents the program and the outcomes of Dagstuhl Seminar 17431 "Performance Portability in Extreme Scale Computing: Metrics, Challenges, Solutions".

Performance Portability is a critical new challenge in extreme-scale computing. In essence, performance-portable applications can be efficiently executed on a wide variety of HPC architectures without significant manual modifications. For nearly two decades, HPC architectures and programming models remained relatively stable, which allowed growth of complex multidisciplinary applications whose lifecycles span multiple generations of HPC platforms.

Recently, however, platforms have been growing much more complex, diverse, and heterogeneous - both within a single system and across systems and generations. Details already known about planned future systems indicate that this trend will continue for the foreseeable future. Current and planned large-scale HPC systems consist of complex configurations with a massive number of components. Each node has multiple multi-core sockets and often one or more additional accelerator units in the form of many-core processors or GPGPUs, resulting in a heterogeneous system architecture. Memory hierarchies, including caches, memory, and storage, are also diversifying in order to meet multiple constraints: power, latency, bandwidth, persistence, reliability, and capacity. These factors are reducing portability and forcing application teams to either spend considerable effort porting and optimizing their applications for each specific platform, or risk owning applications that perform well on perhaps only one architecture. The latter option would still require porting and optimization effort for each new generation of systems.

This Dagstuhl Seminar represented a unique opportunity to bring together international experts from the three research communities essential to tackling this performance portability challenge: developers of large-scale computational science software projects whose lifetime will span multiple generations of systems, researchers developing relevant parallel programming or system software technologies, and specialists in profiling, understanding, and modelling performance. The major research questions for the seminar were:

  • To understand challenges, design metrics, and prioritize potential solutions for performance portability: Solutions will need to synthesize existing concepts across multiple fields, including performance and productivity modeling, programming models and compilation, architectures, and system software.
  • Management of data movement in complex applications: Diverse data movement patterns dictated by different devices form one of the largest impediments to portable performance. Addressing it will require cross-cutting solutions supporting more than one abstraction, and will allow scientists to balance tradeoffs in these factors prior to design, development, or procurement of an architecture, software stack, or application.
  • Composability: Many applications require flexibility and composability because they address different physical regimes either within the same simulation, or in different instances of simulations.
  • Pathways to impact on the research community: As the community becomes more reliant on increasingly complex architectures and software stacks, it is especially important that we develop the conceptual tools to facilitate research and practical solutions for performance portability. Ignoring this topic could be devastating to the quality and sustainability of computational science software, and consequently to the science and engineering research it supports. Thus a key element of the seminar was to tackle this challenge in major science community software projects.

The seminar started with a series of flash talks in which participants introduced themselves with a two-minute, one-slide presentation summarizing their contribution to or interest in the seminar, giving two to three bullet points on (i) Challenge/Opportunity (WHY?), (ii) Timeliness (WHY NOW?), (iii) Approaches (HOW?), and (iv) Impact (SO WHAT?). Each day started with a longer keynote presentation by a representative of one of the major stakeholders in the field, followed by short presentations by participants, grouped into sessions with a common theme. Each keynote or short-talk session ended with an extensive question-and-answer session and an open discussion slot in which all the speakers from the session took part.

The overall conclusion shared by all participants was that performance portability in extreme-scale computing can be achieved, especially if parallel applications are designed with performance portability in mind from the beginning. Model complexity and performance portability both require that frameworks be designed with composable components incorporating layers of abstraction, so that trade-offs can be reasoned about. Making legacy applications performance-portable still requires enormous effort and expertise, and in many instances it will likely require extensive refactoring. Similar design principles regarding the formulation of a flexible and composable framework apply to legacy software refactoring, along with a strong emphasis on rigorous verification built into the process. The seminar recognized the challenges that applications face in adopting abstractions, in particular the difficulty of converting research prototypes into reliable production-grade products. The adverse structure of incentives for both applications and abstractions, and the complexity of formulating a process for collaboration between the two communities, may be bigger barriers than the technical challenges in making performance portability feasible. It is critical that the involved communities and stakeholders are made aware of these challenges while seeking solutions for sustainable computational science projects.


Participants
  • Sadaf Alam (CSCS - Lugano, CH)
  • Michael Bader (TU München, DE)
  • Carlo Bertolli (IBM TJ Watson Research Center - Yorktown Heights, US)
  • Mauro Bianco (CSCS - Lugano, CH)
  • Alexandru Calotoiu (TU Darmstadt, DE)
  • Bradford Chamberlain (Cray Inc. - Seattle, US)
  • Aparna Chandramowlishwaran (University of California - Irvine, US)
  • Kemal A. Delic (Hewlett Packard - Grenoble, FR)
  • Christophe Dubach (University of Edinburgh, GB)
  • Anshu Dubey (Argonne National Laboratory, US)
  • H. Carter Edwards (Sandia National Labs - Albuquerque, US)
  • Jan Eitzinger (Universität Erlangen-Nürnberg, DE)
  • Todd Gamblin (LLNL - Livermore, US)
  • Lin Gan (Tsinghua University - Beijing, CN)
  • William D. Gropp (University of Illinois - Urbana-Champaign, US)
  • Philipp Gschwandtner (Universität Innsbruck, AT)
  • Mary W. Hall (University of Utah - Salt Lake City, US)
  • Robert J. Harrison (Brookhaven National Laboratory - Upton, US)
  • Alexandra Jimborean (Uppsala University, SE)
  • Paul H. J. Kelly (Imperial College London, GB)
  • Andreas Klöckner (University of Illinois - Urbana-Champaign, US)
  • Kathleen Knobe (Rice University - Houston, US)
  • Seyong Lee (Oak Ridge National Laboratory, US)
  • Naoya Maruyama (LLNL - Livermore, US)
  • Chris Maynard (MetOffice - Exeter, GB)
  • Simon McIntosh-Smith (University of Bristol, GB)
  • Richard Membarth (DFKI - Saarbrücken, DE)
  • Lawrence Mitchell (Imperial College London, GB)
  • Bernd Mohr (Jülich Supercomputing Centre, DE)
  • Raymond Namyst (University of Bordeaux, FR)
  • Simon Pennycook (Intel - Santa Clara, US)
  • Istvan Reguly (Pazmany Peter Catholic University - Budapest, HU)
  • P. (Saday) Sadayappan (Ohio State University - Columbus, US)
  • Sven-Bodo Scholz (Heriot-Watt University - Edinburgh, GB)
  • Michelle Mills Strout (University of Arizona - Tucson, US)
  • Nathan Tallent (Pacific Northwest National Lab. - Richland, US)
  • Christian Terboven (RWTH Aachen, DE)
  • Didem Unat (Koc University - Istanbul, TR)
  • Ana Lucia Varbanescu (University of Amsterdam, NL)
  • Jeffrey S. Vetter (Oak Ridge National Laboratory, US)
  • Mohamed Wahib (AIST - Tokyo, JP)
  • Michele Weiland (University of Edinburgh, GB)
  • Robert Wisniewski (Intel - Santa Clara, US)
  • Michael Wolfe (NVIDIA Corp., US)

Classification
  • modelling / simulation
  • optimization / scheduling
  • software engineering

Keywords
  • performance portability
  • productivity
  • parallel programming
  • scientific computing