Dagstuhl Seminar 13381
Algorithms and Scheduling Techniques for Exascale Systems
( Sep 15 – Sep 20, 2013 )
- Henri Casanova (University of Hawaii at Manoa - Honolulu, US)
- Yves Robert (ENS - Lyon, FR)
- Uwe Schwiegelshohn (TU Dortmund, DE)
- Annette Beyer (for administrative matters)
Hardware manufacturers are currently deploying machines with sustained petascale performance, while already looking forward to produce exascale machines. However, these computing systems have already reached such a degree of sophistication that the conventional approach of hardware goes first and applications follow is likely to fail.
A key question is to determine which applications are best suited for execution on exascale systems. In addition to application areas like hierarchical numerical linear algebra with a proven history of success in parallel computing, it is also necessary to determine whether other areas can benefit from these machines. For instance, the increasing complexity of many new social and technical systems requires new discrete optimization techniques suitable for high performance computing systems. Exascale systems are likely to contain 105 to 106 processors, each processor itself being equipped with more than 100 cores, and possibly 103 to 104 GPU cores. At the node level, the architecture is a shared-memory machine, running many parallel threads on the cores and on the GPUs while at the machine level, the architecture is a distributed-memory machine. Therefore, these systems will exhibit a larger degree of heterogeneity than the massively parallel architectures of today. The interplay between the various components dramatically complicates the design of new versions of the standard algorithms that are at the heart of many scientific applications. In particular, core elements of existing applications must be modified so that they can form the building blocks of new exascale applications while new methods specifically targeting exascale systems must be developed for new application areas.
As failures are increasingly likely to occur during the execution of parallel jobs that enrol very large numbers of processors resilience is a key challenge for exascale systems. For instance, a failure would occur every 50 minutes on a machine with 106 processors even if the MTBF (Mean Time Between Failures) of a single processor is assumed to be as large as 100 years. Therefore, the design of efficient fault-tolerant scheduling strategies will be key to high performance. One interesting question is that of the sharing of responsibility for failure tolerance among the applications themselves and the management software/hardware for an exascale system.
Further, power consumption is a crucial issue for exascale systems. Although these systems are undoubtedly performance oriented, they may not be able to maintain their peak performance for a long period of time due to power constraints. Therefore, their management must support power saving approaches like dynamic voltage and frequency scaling (DVFS) and resource consolidation whenever possible. In the end, an application is supposed to make best use of the heterogeneity of the system, in order to run efficiently with respect to power and performance. The resulting multi-criteria scheduling problems must be solved by resource management components designed for exascale systems.
The goal of this seminar is to foster discussions and articulate novel and promising directions for addressing the three above challenges, application design, fault-tolerance, and power management, in the context of exascale systems.
Hardware manufacturers are currently deploying machines with sustained petascale performance while already looking forward to produce Exascale machines. Exascale systems are likely to contain 10^5 to 10^6 processors, each processor itself being equipped with more than 100 cores, and possibly 10^3 to 10^4 GPU cores. These systems already reach such a degree of sophistication and complexity that the conventional approach of hardware goes first and applications follow is likely to fail. Furthermore, application performance is no longer solely defined by time-to-solution but also by power consumption and resilience to fault. Many conferences and workshops are dedicated to the architecture and systems issues pertaining to Exascale computing. Instead, in this seminar we have discussed algorithmic issues (application parallelization, application scheduling, resource management, etc.) that must be addresses to make Exascale computing a tenable proposition. As seen in many of the presentations during the seminar, core elements or principles of existing applications must be modified so that they can form the building blocks of new Exascale applications while new methods specifically targeting Exascale systems must be developed for new application areas.
The presentations during the seminar covered a wide range of topics. Some of these topics were directly targeted to various aspects of "the Exascale problem". Some topics were targeted to components of the problem, e.g., efficient execution of application kernels on a heterogeneous many-core node. Finally, yet other topics were in broader, and often more theoretical, parallel and distributed computing contexts with less immediate but possibly large impact on the future of Exascale computing. Overall, the topics presented and discussed during the workshop can be roughly categorized as follows, noting that at least half the presentations spanned more than one of these topics:
- Fault-tolerance. Fault-tolerance is a major concern at large scale and several presentations focused on the limitations of current checkpoint-restart fault-tolerance techniques, providing analytical studies to quantify expected performance of these solutions and comparing them to proposed new solutions. These new solutions included, for instance, the use of algorithm-specific checkpointing combined with system-level checkpointing, or the use of imperfect fault predictors.
- Multi-criteria optimization. A large number of presentations presented multi-criteria optimization problems, including one traditional performance metric (throughput, makespan) and one (2-criteria) or two (3-criteria) metrics relating to power consumption and/or reliability. Several works studied the use of techniques such as DVFS to trade-off performance for a lower power consumption. These multi-criteria problems were formalized, and various theoretical and practical results were obtained in attempts to solve these problems. Two main approaches were followed:(i) optimizing one metric w.r.t. constraints on the other metric(s); or (ii) obtaining Pareto-optimal solutions or determining the entire Pareto front.
- Multiple cores. A handful of presentations focused on the above optimization problems not on large-scale platforms but on many-core nodes with shared memory, i.e., the intended individual components of future Exascale platforms. These nodes consist of possibly heterogeneous cores, accelerators (GPUs, etc.) connected via busses and on-chip networks.
- Novel scheduling results. A large number of presentations included novel findings regarding the complexity of scheduling problems. These scheduling problems are of general interest for various models of parallel computation, as motivated by the above topics. Results consisted of p-time optimal algorithms, new NP-completeness results, approximation algorithms, and efficient heuristics.
- Exascale scientific computing. Several presentations focused on particular scientific applications (e.g., PDE solvers) or scientific kernels (e.g., matrix multiplication), and discussed how age-old algorithms should be adapted to exploit Exascale platforms with heterogeneous components, hierarchical networks, and the need to have both efficient and rare communication primitive invocations. One presentation presented recent experience with scalable performance monitoring and performance debugging, capabilities that will be crucial in the practice of Exascale computing.
- Programming models for Exascale. A handful of presentations spoke to the need for novel programming models at large scale. These presentations spanned the spectrum from very (e.g., actual implementations of programming models usable today, proposals to enhance current programming standards) to theoretical (e.g., a new theoretical approach to assess the efficiency of techniques such as work stealing and least-loaded-machine-first scheduling when the number of compute nodes tends to infinity).
- Resource and application management. A handful of presentations discussed Exascale computing in the context of cloud computing. In other words, these presenters made a case for applying/evolving some of the concepts currently applied in cloud deployments to future Exascale platforms (e.g., service level agreements, virtualization, resource economy). A number of open problems were identified when trying to make these two "worlds" collide.
Although the presentations at the seminar were very diverse in scope, ranging from practice to theory, an interesting observation is that many works do establish strong links between practice (e.g., particular applications, programming models) and theory (e.g., abstract scheduling problems and results). For instance, it was found that the age-old numerical linear algebra topic, far from being well-understood, in fact gives rise to a range of interrelated and interesting practical and theoretical problems that must be solved conjointly to achieve efficiency at large scale. Such observations make it plain that forums that blends practice and theory, as is the case with this seminar, are very much needed.
The seminar brought together 41 researchers from Austria, France, Germany, Ireland, Morocco, Netherlands, New Zealand, Switzerland, U.K, and U.S.A. with interests and expertise in different aspect of parallel and distributed computing. Among participants there was a good mix of senior researchers, junior researchers, postdoctoral researchers, and Ph.D. students. Altogether there were 36 presentations over the 5 days of the seminar, organized in morning and late-afternoon sessions, plus an open problem session. The program was as usual a compromise between allowing sufficient time for participants to present their work, while also providing unstructured periods that were used by participants to pursue ongoing collaborations as well as to foster new ones. The feedback provided by the participants show that the goals of the seminar, namely to circulate new ideas and create new collaborations, were met to a large extent.
The organizers and participants wish to thank the staff and the management of Schloss Dagstuhl for their assistance and support in the arrangement of a very successful and productive event.
- Ismail Assayad (University Hassan II - Casablanca, MA) [dblp]
- Guillaume Aupy (ENS - Lyon, FR) [dblp]
- Olivier Beaumont (INRIA - Bordeaux, FR) [dblp]
- Anne Benoit (ENS - Lyon, FR) [dblp]
- George Bosilca (University of Tennessee, US) [dblp]
- Aurélien Bouteiller (University of Tennessee, US) [dblp]
- Heinrich Braun (SAP SE - Walldorf, DE) [dblp]
- Henri Casanova (University of Hawaii at Manoa - Honolulu, US) [dblp]
- Anthony Danalis (University of Tennessee, US) [dblp]
- Carsten Franke (ABB Corporate Research - Baden-Dättwil, CH) [dblp]
- Bruno Gaujal (INRIA - Grenoble, FR) [dblp]
- Dominik Göddeke (TU Dortmund, DE) [dblp]
- Christian Grimme (Universität Münster, DE) [dblp]
- Abdou Guermouche (University of Bordeaux, FR) [dblp]
- Amina Guermouche (University of Versailles, FR) [dblp]
- Thomas Herault (University of Tennessee, US) [dblp]
- Julien Herrmann (ENS - Lyon, FR) [dblp]
- Sascha Hunold (TU Wien, AT) [dblp]
- Emmanuel Jeannot (INRIA - Bordeaux, FR) [dblp]
- Thilo Kielmann (VU University Amsterdam, NL) [dblp]
- Thomas Lambert (ENS - Lyon, FR) [dblp]
- Alexey Lastovetsky (University College Dublin, IE) [dblp]
- Bradley Lowery (University of Colorado - Denver, US) [dblp]
- Anthony A. Maciejewski (Colorado State University, US) [dblp]
- Loris Marchal (ENS - Lyon, FR) [dblp]
- Alix Munier (UPMC - Paris, FR) [dblp]
- Wolfgang E. Nagel (TU Dresden, DE) [dblp]
- Jean-Marc Nicod (FEMTO-ST Institute - Besancon, FR) [dblp]
- Thomas Rauber (Universität Bayreuth, DE) [dblp]
- Paul Renaud-Goud (University of Bordeaux, FR) [dblp]
- Yves Robert (ENS - Lyon, FR) [dblp]
- Gudula Rünger (TU Chemnitz, DE) [dblp]
- Rizos Sakellariou (University of Manchester, GB) [dblp]
- Erik Saule (University of North Carolina - Charlotte, US) [dblp]
- Uwe Schwiegelshohn (TU Dortmund, DE) [dblp]
- Howard Jay Siegel (Colorado State University, US) [dblp]
- Oliver Sinnen (University of Auckland, NZ) [dblp]
- Veronika Sonigo (University of Franche-Comté - Besancon, FR) [dblp]
- Stefan Turek (TU Dortmund, DE) [dblp]
- Bora Uçar (ENS - Lyon, FR) [dblp]
- Frédéric Vivien (ENS - Lyon, FR) [dblp]
- Dagstuhl Seminar 15281: Algorithms and Scheduling Techniques to Manage Resilience and Power Consumption in Distributed Systems (2015-07-05 - 2015-07-10) (Details)
- optimization / scheduling
- High performance computing
- heterogeneous large-scale platforms
- highly parallel algorithms
- fault tolerance
- energy efficiency